PEPTIDE LIBRARY AND SCREENING METHOD

CROSS-REFERENCE TO RELATED APPLICATIONS 
The present application is a continuation-in-part of
USSN 08/290,641, filed 8/15/94, which is a continuation of
USSN 07/963,321, now US 5,338,665, filed October 15, 1992,
which is a continuation-in-part of USSN 07/778,233, filed
October 16, 1991, now US 5,270,170, and is related to copending
USSN 07/517,659, filed May 1, 1990, and to copending USSN
07/541,108, filed June 20, 1990, which is a
continuation-in-part of copending USSN 07/718,577, filed June
20, 1991, each of which is incorporated by reference in its 
entirety for all purposes. 

FIELD OF THE INVENTION 
The present invention relates generally to methods 
for selecting peptide ligands to receptor molecules of
interest and, more particularly, to methods for generating and 
screening large peptide libraries for peptides with desired 
binding characteristics. 

BACKGROUND OF THE INVENTION 
The isolation of ligands that bind biological 
receptors is fundamental to understanding signal transduction 
and to discovering new therapeutics. The ability to 
synthesize DNA chemically has made possible the construction 
of extremely large collections of nucleic acid and peptide 
sequences as potential ligands. Recently developed methods 
allow efficient screening of libraries for desired binding 
activities (see Pluckthun & Ge, Angew. Chem. Int. Ed. Engl.
30, 296-298 (1991)). For example, RNA molecules with the
ability to bind a particular protein (see Tuerk & Gold,
Science 249, 505-510 (1990)) or a dye (see Ellington & Szostak,
Nature 346, 818-822 (1990)) have been selected by alternating
rounds of affinity selection and PCR amplification. A similar
technique was used to determine the DNA sequences that bound a
human transcription factor (see Thiesen & Bach, Nucl. Acids
Res. 18, 3203-3209 (1990)).

Application of efficient screening techniques to
peptides requires the establishment of a physical or logical
connection between each peptide and the nucleic acid that
encodes the peptide. After rounds of affinity enrichment,
such a connection allows identification, usually by
amplification and sequencing, of the genetic material encoding
interesting peptides. Several phage based systems for
screening proteins and polypeptides have been described. The
fusion phage approach of Parmley and Smith, 1988, Gene 73,
305-318, can be used to screen proteins. Others have
described phage based systems in which the peptide is fused to
the pIII coat protein of filamentous phage (see Scott & Smith,
Science 249, 386-390 (1990); Devlin et al., Science 249,
404-406 (1990); and Cwirla et al., Proc. Natl. Acad. Sci. USA
87, 6378-6382 (1990); each of which is incorporated herein by
reference).

In these latter publications, the authors describe
expression of a peptide at the amino terminus of or internal
to the pIII protein. The connection between peptide and the
genetic material that encodes the peptide is established
because the fusion protein is part of the capsid enclosing the
phage genomic DNA. Phage encoding peptide ligands for
receptors of interest can be isolated from libraries of
greater than 10^8 peptides after several rounds of affinity
enrichment followed by phage growth. Other non-phage based
systems that could be suggested for the construction of
peptide libraries include direct screening of nascent peptides
on polysomes (see Tuerk & Gold, supra) and display of peptides
directly on the surface of E. coli. As in the filamentous
phage system, all of these methods rely on a physical
association of the peptide with the nucleic acid that encodes
the peptide.

There remains a need for methods of constructing
peptide libraries in addition to the methods described above.
For instance, the above methods do not provide random peptides
with a free carboxy terminus, yet such peptides would add
diversity to the peptide structures now available for receptor
binding. In addition, prior art methods for constructing
random peptide libraries cannot tolerate stop codons in the
degenerate region coding for the random peptide, yet stop
codons occur with frequency in degenerate oligonucleotides.
Prior art methods involving phage fusions require that the
fusion peptide be exported to the periplasm and so are limited
to fusion proteins that are compatible with the protein export
apparatus and the formation of an intact phage coat.

The present invention provides random peptide
libraries and methods for generating and screening those
libraries with significant advantages over the prior art
methods.

SUMMARY OF THE INVENTION

The present invention provides random peptide 
libraries and methods for generating and screening those 
libraries to identify peptides that bind to receptor molecules 
of interest. The peptides can be used for therapeutic,
diagnostic, and related purposes, e.g., to bind the receptor
or an analogue of the receptor and so inhibit or promote the 
activity of the receptor. 

The peptide library of the invention is constructed
so that the peptide is expressed as a fusion product; the
peptide is fused to a DNA binding protein. The peptide
library is constructed so that the DNA binding protein can
bind to the recombinant DNA expression vector that encodes the
fusion product that contains the peptide of interest. The
method of generating the peptide library of the invention
comprises the steps of (a) constructing a recombinant DNA
vector that encodes a DNA binding protein and contains a
binding site for the DNA binding protein; (b) inserting into
the coding sequence of the DNA binding protein in the vector
of step (a) a coding sequence for a peptide such that the
resulting vector encodes a fusion protein composed of the DNA
binding protein and the peptide; (c) transforming a host cell
with the vector of step (b); and (d) culturing the host cell
transformed in step (c) under conditions suitable for
expression of the fusion protein.

The screening method of the invention comprises the
steps of (a) lysing the cells transformed with the peptide
library under conditions such that the fusion protein remains
bound to the vector that encodes the fusion protein;
(b) contacting the fusion proteins of the peptide library with
a receptor under conditions conducive to specific
peptide-receptor binding; and (c) isolating the vector that
encodes a peptide that binds to said receptor. By repetition
of the affinity selection process one or more times, the
plasmids encoding the peptides of interest can be enriched. By
increased stringency of the selection, peptides of
increasingly higher affinity can be identified.

The present invention also relates to recombinant
DNA vectors useful for constructing the random peptide
library, the random peptide library, host cells transformed 
with the recombinant vectors of the library, and fusion 
proteins expressed by those host cells. 

BRIEF DESCRIPTION OF THE DRAWINGS 
Fig. 1 shows a recombinant vector of a random
peptide library of the invention. In this embodiment of the
invention, the DNA binding protein is the lacI gene product,
the fusion protein forms a tetramer, and the tetramer
interacts with the vector and immobilized receptor, as shown
in the Figure. The library plasmid carries the lacI gene with
random coding sequence fused to the 3' end of the coding
sequence of the gene, as well as two lacO sequences. The lac
repressor-peptide fusions produced by the hybrid genes bind to
the lacO sites on the same plasmid that encodes them. After
lysis of cells containing the random library, those
plasmid-repressor-peptide complexes that specifically bind a
chosen receptor are enriched by avidity panning against the
immobilized receptor. Transformation of E. coli with
recovered plasmids allows additional rounds of panning or
sequencing of isolated clones.



WO 96/40987 

PCT/US96/09809 

Fig. 2 [SEQ ID NOS:1-6] shows a partial restriction
site, DNA sequence, and function map of plasmid pMC5.
Hybridization of oligonucleotide ON-332 to oligonucleotides
ON-369 and ON-370 produces a fragment with cohesive ends
compatible with SfiI, HindIII digested plasmid pMC5. The
ligation product adds sequence coding for twelve random amino
acids to the end of lacI through a six codon linker. The
library plasmid also contains: the rrnB transcriptional
terminator, the bla gene to permit selection on ampicillin,
the M13 phage intergenic region to permit rescue of
single-stranded DNA, a plasmid replication origin (ori), two
lacOs sequences, and the araC gene to permit positive and
negative regulation of the araB promoter that drives
expression of the lacI fusion gene.

Fig. 3 [SEQ ID NOS:7-64] shows sequences isolated by
panning with the D32.39 antibody. Each sequence is listed
with a clone number, the panning round in which the clone was
isolated, and the result of the ELISA with D32.39 antibody.
The sequences are aligned to show the D32.39 epitope that they
share (box).

Fig. 4 [SEQ ID NOS:33-91] shows the linker sequences 
from vectors pJS141 and pJS142.

Fig. 5 [SEQ ID NO:66]: Arrangement of lac 
headpieces, linkers and displayed peptide in headpiece dimer. 

Fig. 6 [SEQ ID NOS:87 AND 92-122]: Sequences of 
headpiece dimer proteins. (a) Sequence of headpiece domains 
and adjoining linkers as constructed for the headpiece dimer 
linker library. (b) Protein sequence of linker library clones 
isolated after four rounds of panning selection, showing 
linker sequences and residue changes from the original 
headpiece protein sequence where indicated. Unchanged 
residues are marked with a dot "."; residue deletions are
noted with a hyphen "-". (c) Protein sequences of clones
isolated after mutagenesis and four rounds of panning
selection. Unsequenced positions are noted with question
marks.

Fig. 7 [SEQ ID NOS:123-126]: Construction of
headpiece dimer libraries in vector pCMG14. (a) Restriction
map and positions of genes. The library plasmid includes: the
rrnB transcriptional terminator, the bla gene to permit
selection on ampicillin, the M13 phage intergenic region (M13
IG) to permit rescue of single-stranded DNA, a plasmid
replication origin (ori), one lacOs sequence, and the araC
gene to permit positive and negative regulation of the araB
promoter driving expression of the headpiece dimer fusion
gene. (b) Sequence of the cloning region at the 3' end of the
headpiece dimer gene, including the SfiI and EagI sites used
during library construction. (c) Ligation of annealed
ON-1679, ON-829, and ON-830 to SfiI sites of pCMG14 to produce
a library. Single spaces in the sequence indicate sites of
ligation.

Fig. 8 [SEQ ID NOS:127-162]: Sequences of D32.39
MAb-specific peptides isolated from random libraries after
four rounds of panning. Peptides derived from the headpiece
dimer library are preceded by "HpD"; sequences from the lacI
peptides-on-plasmids library are preceded by "lacI". The
isolate numbers correspond to those in Fig. 9. The boxed
portion represents the alignment of peptide sequence with the
known D32.39 monoclonal antibody epitope RQFKVVT [SEQ ID
NO:56].

Fig. 9: MBP ELISA using peptides isolated from
headpiece dimer and lacI peptides-on-plasmids random
libraries. High and low affinity control peptides are
expressed by plasmids pCMG39 and pCMG33, respectively. The
pELM3 negative control encodes MBP with an irrelevant fusion
peptide. Random library clones are numbered as in Fig. 8.

DESCRIPTION OF THE SPECIFIC EMBODIMENTS

For purposes of clarity and a complete understanding 
of the invention, the following terms are defined. 

"DNA Binding Protein" refers to a protein that 
specifically interacts with deoxyribonucleotide strands. A
sequence-specific DNA binding protein binds to a specific
sequence or family of specific sequences showing a high degree
of sequence identity with each other (e.g., at least about 80%
sequence identity) with at least 100-fold greater affinity
than to unrelated sequences. The dissociation constant of a
sequence-specific DNA binding protein to its specific
sequence(s) is usually less than about 100 nM, and may be as
low as 10 nM, 1 nM, 1 pM or 1 fM. A nonsequence-specific DNA
binding protein binds to a plurality of unrelated DNA
sequences with a dissociation constant that varies by less
than 100-fold, usually less than tenfold, to the different
sequences. The dissociation constant of a
nonsequence-specific DNA binding protein to the plurality of
sequences is usually less than about 1 µM. In embodiments of
the invention in which RNA vectors are used, DNA binding
protein can also refer to an RNA binding protein.
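
The affinity thresholds in this definition can be read as a simple classification rule. The Python sketch below is illustrative only: the function names and the example dissociation constants are hypothetical, and the 100-fold cutoff is the one stated in the definition above.

```python
def is_sequence_specific(kd_specific_molar, kd_unrelated_molar, min_fold=100.0):
    """Return True if the specific site is bound at least `min_fold` times more
    tightly (lower Kd) than unrelated DNA. Names and defaults are illustrative."""
    return kd_unrelated_molar / kd_specific_molar >= min_fold


def is_nonsequence_specific(kds_molar, max_fold=100.0):
    """Return True if Kd varies by less than `max_fold` across unrelated sequences."""
    return max(kds_molar) / min(kds_molar) < max_fold


# Hypothetical values: 1 nM for the cognate site versus 1 uM for unrelated DNA.
print(is_sequence_specific(1e-9, 1e-6))
# Hypothetical Kd values measured on three unrelated sequences.
print(is_nonsequence_specific([2e-7, 5e-7, 9e-7]))
```

A 1000-fold preference satisfies the sequence-specific criterion, while a 4.5-fold spread across unrelated sequences satisfies the nonsequence-specific one.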

"Epitope 11 refers to that portion of an antigen that 
interacts with an antibody. 

"Host Cell" refers to a eukaryotic or procaryotic
cell or group of cells that can be or has been transformed by
a recombinant DNA vector. For purposes of the present
invention, a host cell is typically a bacterium, such as an
E. coli K12 cell or an E. coli B cell.

"Ligand" refers to a molecule, such as a random
peptide, that is recognized by a particular receptor.

"Ligand Fragment" refers to a portion of a gene 
encoding a ligand and to the portion of the ligand encoded by 
that gene fragment. 

"Ligand Fragment Library" refers not only to a set
of recombinant DNA vectors that encodes a set of ligand
fragments, but also to the set of ligand fragments encoded by 
those vectors, as well as the fusion proteins containing those 
ligand fragments. 

"Linker" or "spacer" refers to a molecule or group
of molecules that connects two molecules, such as a DNA
binding protein and a random peptide, and serves to place the
two molecules in a preferred configuration, e.g., so that the
random peptide can bind to a receptor with minimal steric
hindrance from the DNA binding protein.

"Peptide" or "polypeptide" refers to a polymer in 
which the monomers are alpha amino acids joined together 
through amide bonds. Peptides are two or often more amino
acid monomers long. Standard abbreviations for amino acids
are used herein (see Stryer, 1988, Biochemistry, Third Ed.,
incorporated herein by reference).

"Random Peptide" refers to an oligomer composed of 
two or more amino acid monomers and constructed by a 
stochastic or random process. A random peptide can include 
framework or scaffolding motifs, as described below. 

"Random Peptide Library" refers not only to a set of 
recombinant DNA vectors that encodes a set of random peptides, 
but also to the set of random peptides encoded by those 
vectors, as well as the fusion proteins containing those 
random peptides. 

"Receptor" refers to a molecule that has an affinity 
for a given ligand. Receptors can be naturally occurring or 
synthetic molecules. Receptors can be employed in an 
unaltered state or as aggregates with other species. 
Receptors can be attached, covalently or noncovalently, to a
binding member, either directly or via a specific binding 
substance. Examples of receptors include, but are not limited 
to, antibodies, including monoclonal antibodies and antisera 
reactive with specific antigenic determinants (such as on 
viruses, cells, or other materials), cell membrane receptors, 
enzymes, and hormone receptors. 

"Recombinant DNA Vector" refers to a DNA or RNA 
molecule that encodes a useful function and can be used to 
transform a host cell. For purposes of the present invention, 
a recombinant DNA vector typically is a phage or plasmid and 
can be extrachromosomally maintained in a host cell or 
controllably integrated into and excised from a host cell 
chromosome. 

The present invention provides random peptide 
libraries and methods for generating and screening those 
libraries to identify either peptides that bind to receptor 
molecules of interest or gene products that modify peptides or 
RNA in a desired fashion. The peptides are produced from 
libraries of random peptide expression vectors that encode 
peptides attached to a DNA binding protein. A method of 
affinity enrichment allows a very large library of peptides to 
be screened and the vector carrying the desired peptide(s) to
be selected. The nucleic acid can then be isolated from the
vector and sequenced to deduce the amino acid sequence of the
desired peptide. Using these methods, one can identify a
peptide as having a desired binding affinity for a molecule.
The peptide can then be synthesized in bulk by conventional
means.

By identifying the peptide de novo, one need not 
know the sequence or structure of the receptor molecule or the 
sequence or structure of the natural binding partner of the 
receptor. Indeed, for many "receptor" molecules a binding 
partner has not yet been identified. A significant advantage 
of the present invention is that no prior information 
regarding an expected ligand structure is required to isolate 
peptide ligands of interest. The peptide identified will have 
biological activity, which is meant to include at least 
specific binding affinity for a selected receptor molecule 
and, in some instances, will further include the ability to 
block the binding of other compounds, to stimulate or inhibit 
metabolic pathways, to act as a signal or messenger, to 
stimulate or inhibit cellular activity, and the like. 

The number of possible receptor molecules for which 
peptide ligands may be identified by means of the present 
invention is virtually unlimited. For example, the receptor 
molecule may be an antibody (or a binding portion thereof) . 
The antigen to which the antibody binds may be known and 
perhaps even sequenced, in which case the invention may be 
used to map epitopes of the antigen. If the antigen is 
unknown, such as with certain autoimmune diseases, for
example, sera, fluids, tissue, or cells from patients with the
disease can be used in the present screening method to
identify peptides, and consequently the antigen, that elicit
the autoimmune response. One can also use the present
screening method to tailor a peptide to a particular purpose. 
Once a peptide has been identified, that peptide can serve as,
or provide the basis for, the development of a vaccine, a
therapeutic agent, a diagnostic reagent, etc.

The present invention can be used to identify
peptide ligands for a wide variety of receptors in addition to
antibodies. These ligands include, by way of example and not
limitation, growth factors, hormones, enzyme substrates,
interferons, interleukins, intracellular and intercellular
messengers, lectins, cellular adhesion molecules, and the
like. Peptide ligands can also be identified by the present
invention for molecules that are not peptides or proteins,
e.g., carbohydrates, non-protein organic compounds, metals,
etc. Thus, although antibodies are widely available and
conveniently manipulated, antibodies are merely representative
of receptor molecules for which peptide ligands can be
identified by means of the present invention.

The peptide library is constructed so that the DNA
binding protein-random peptide fusion product can bind to the
recombinant DNA expression vector that encodes the fusion
product that contains the peptide of interest. The method of
generating the peptide library comprises the steps of
(a) constructing a recombinant DNA vector that encodes a DNA
binding protein and contains binding sites for the DNA binding
protein; (b) inserting into the coding sequence of the DNA
binding protein in a multiplicity of vectors of step (a)
coding sequences for random peptides such that the resulting
vectors encode different fusion proteins, each of which is
composed of the DNA binding protein and a random peptide;
(c) transforming host cells with the vectors of step (b); and
(d) culturing the host cells transformed in step (c) under
conditions suitable for expression of the fusion proteins.
Typically, a random peptide library will contain at least 10^6
to 10^8 different members, although library sizes of 10^8 to
10^13 can be achieved.
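
To put these library sizes in context, the number of distinct sequences that a random region of x codons can encode is easily computed. The following Python sketch is a back-of-the-envelope illustration, not part of the disclosed method; it assumes 32 codons per position (the NNK motif described in this specification) and 20 amino acids per position, and the function names are hypothetical.

```python
def sequence_space(x, codons_per_position=32, residues_per_position=20):
    """Return (distinct DNA sequences, distinct peptides) for x random codons.
    The defaults assume an NNK-type motif and the 20 standard amino acids."""
    return codons_per_position ** x, residues_per_position ** x


def max_fraction_sampled(library_size, x):
    """Upper bound on the fraction of all x-mer peptides a library can contain,
    assuming every clone carried a different peptide (an optimistic assumption)."""
    _, peptides = sequence_space(x)
    return min(1.0, library_size / peptides)


for x in (6, 12):
    dna, peptides = sequence_space(x)
    print(x, dna, peptides, max_fraction_sampled(10**8, x))
```

Under these assumptions a library of 10^8 clones could in principle span all 6.4 x 10^7 hexapeptides, but can sample only a minute fraction of the roughly 4 x 10^15 possible dodecapeptides.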

The peptide library produced by this method is
especially useful in screening for ligands that bind to a
receptor of interest. This screening method comprises the
steps of (a) lysing the cells transformed with the peptide
library under conditions such that the fusion protein remains
bound to the vector that encodes the fusion protein;
(b) contacting the fusion proteins of the peptide library with
a receptor under conditions conducive to specific
peptide-receptor binding; and (c) isolating the vector that
encodes a peptide that binds to said receptor. By repetition
of the affinity selection process one or more times, the
vectors that encode the peptides of interest may be enriched.
By increased stringency of the selection, peptides of
increasingly higher affinity can be identified. If the
presence of cytoplasmic or periplasmic proteins interferes
with binding of fusion protein to receptor, then partial
purification of fusion protein-plasmid complexes by gel
filtration, affinity, or other purification methods can be
used to prevent such interference. For instance, purification
of the cell lysate on a column (such as the Sephacryl S-400 HR
column) that removes small proteins and other molecules may be
useful.
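
The effect of repeating the affinity selection can be illustrated with a toy enrichment model. The Python sketch below is a simplification under stated assumptions: each round recovers a fixed fraction of binder-encoding vectors and a much smaller fixed fraction of background vectors, and regrowth between rounds amplifies both equally. The function name and all numerical values are hypothetical and are not taken from this specification.

```python
def binder_fraction_after_rounds(initial_fraction, binder_recovery,
                                 background_recovery, rounds):
    """Return the fraction of recovered vectors that encode binders after each
    round of panning, assuming constant per-round recovery yields."""
    binders = initial_fraction
    background = 1.0 - initial_fraction
    history = []
    for _ in range(rounds):
        binders *= binder_recovery          # binders retained on the receptor
        background *= background_recovery   # nonspecific carry-over
        history.append(binders / (binders + background))
    return history


# Hypothetical case: one binder per million clones, 10% recovery of binders,
# 0.01% nonspecific recovery of everything else.
for round_number, fraction in enumerate(
        binder_fraction_after_rounds(1e-6, 0.1, 1e-4, 4), start=1):
    print(round_number, fraction)
```

With these illustrative yields each round enriches binders about 1000-fold over background, so a clone that starts at one in a million comes to dominate the pool within three to four rounds.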

The recombinant vectors of the random peptide
library are constructed so that the random peptide is
expressed as a fusion product; the peptide is fused to a DNA
binding protein. A DNA binding protein of the invention must
exhibit high avidity binding to DNA and have a region that can
accept insertions of amino acids without interfering with the
DNA binding activity. The half-life of a DNA binding
protein-DNA complex produced by practice of the present method
must be long enough to allow screening to occur. Typically,
the half-life will be at least 15 min and often between one and
four hours or longer.

Suitable DNA binding proteins for purposes of the
present invention include proteins selected from a large group
of known DNA binding proteins including transcriptional
regulators and proteins that serve structural functions on
DNA. Examples include: proteins that recognize DNA by virtue
of a helix-turn-helix motif, such as the phage 434 repressor,
the lambda phage cI and cro repressors, and the E. coli CAP
protein from bacteria and proteins from eukaryotic cells that
contain a homeobox helix-turn-helix motif; proteins containing
the helix-loop-helix structure, such as myc and related
proteins; proteins with leucine zippers and DNA binding basic
domains such as fos and jun; proteins with 'POU' domains such
as the Drosophila paired protein; proteins with domains whose
structures depend on metal ion chelation such as Cys2His2 zinc
fingers found in TFIIIA, Zn2(Cys)6 clusters such as those
found in yeast Gal4, the Cys3His box found in retroviral
nucleocapsid proteins, and the Zn2(Cys)8 clusters found in
nuclear hormone receptor-type proteins; the phage P22 Arc and
Mnt repressors (see Knight et al., J. Biol. Chem. 264,
3639-3642 (1989) and Bowie & Sauer, J. Biol. Chem. 264,
7596-7602 (1989), each of which is incorporated herein by
reference); and others. Proteins that bind DNA in a
non-sequence-specific manner are also used, for example,
histones, protamines, and HMG type proteins. In addition,
proteins could be used that bind to DNA indirectly, by virtue
of binding another protein bound to DNA. Examples of these
include yeast Gal80 and adenovirus E1A protein. Phage coat
proteins, which associate with DNA by encapsidation of the DNA
in a phage coat and are used in the phage display methods of
screening peptides discussed in the Background, are typically
not employed in the present invention.

Some DNA binding proteins can be selected from the
above list by virtue of their possession of a dissociation
half-life of at least fifteen min. Data on dissociation
half-lives are available for several DNA binding proteins. For
example, the Arc repressor of phage P22 has a dissociation
half-life of 80 min (see, e.g., Knight et al., J. Biol. Chem.
264, 3639-3642 (1989); Vershon et al., J. Mol. Biol. 195,
323-331 (1987)). For other DNA binding proteins, dissociation
half-life can be determined by standard biochemical procedures
(see, e.g., Bourgeois, Methods Enzymol. 21D, 491-500 (1971)
(filter binding assay); Knight & Sauer, J. Biol. Chem. 264,
13706-13710 (1989) (DNA modification protection assay)).

The lac repressor is one of the many DNA binding
proteins that can be used in the construction of the libraries
of the invention. The lac repressor, a 37 kDa protein, is the
product of the E. coli lacI gene and negatively controls
transcription of the lacZYA operon by binding to a specific
DNA sequence called lacO. Structure-function relationships in
the lac repressor have been studied extensively through the
construction of thousands of amino acid substitution variants
of the protein (see Gordon et al., J. Mol. Biol. 200, 239-251
(1988), and Kleina & Miller, J. Mol. Biol. 212, 295-313
(1990)). The repressor exists as a tetramer in its native
form with two high affinity DNA binding domains formed by the
amino termini of the subunits (see Beyreuther, The Operon
(Miller and Reznikoff, eds., Cold Spring Harbor Laboratory,
1980), pp. 123-154). The two DNA binding sites exhibit strong
cooperativity of binding to DNA molecules with two lacO
sequences. A single tetramer can bind to suitably spaced
sites on a plasmid, forming a loop of DNA between the two
sites, and the resulting complex is stable for days (see Besse
et al., EMBO J. 5, 1377-1381 (1986); Flashner & Gralla, Proc.
Natl. Acad. Sci. USA 85, 8968-8972 (1988); Hsieh et al., J.
Biol. Chem. 262, 14583-14591 (1987); Kramer et al., EMBO J.
6, 1481-1491 (1987); Mossing & Record, Science 233, 889-892
(1986); and Whitson et al., J. Biol. Chem. 262, 14592-14599
(1987)).

The carboxy terminal domains of the lac repressor
form the dimer and tetramer contacts, but significantly,
fusions of proteins as large as β-galactosidase can be made to
the carboxy terminus without eliminating the DNA binding
activity of the repressor (see Muller-Hill and Kania, Nature
249, 561-563 (1974); and Brake et al., Proc. Natl. Acad. Sci.
USA 75, 4324-4327 (1978)). The lac repressor fusion proteins
of the present invention include not only carboxy terminus
fusions but also amino terminus fusions and peptide insertions
in the lac repressor. Substitutions of other sequences,
including eukaryotic nuclear localization signals,
transcriptional activation domains, and nuclease domains, have
been made at both the amino and carboxy termini of the lac
repressor without serious disruption of specific DNA binding
(see Hu and Davidson, Gene 99, 141-150 (1991); Labow et al.,
Mol. Cell. Biol. 10, 3343-3356 (1990); and Panayotatos et al.,
J. Biol. Chem. 264, 15066-15069 (1989)).

The binding of the lac repressor to a single
wild-type lacO is both tight and rapid, with a dissociation
constant of 10^-13 M, an association rate constant of 7 x 10^9
M^-1 s^-1, and a half-life for the lac repressor-lacO complex of
about 30 min (see Barkley and Bourgeois, 1980, The Operon
(Miller and Reznikoff, eds., Cold Spring Harbor Laboratory),
pp. 177-220). The high stability of the lac repressor-DNA
complex has permitted its use in methods for identifying DNA
binding proteins (see Levens and Howley, Mol. Cell. Biol. 5,
2307-2315 (1985)), for quantifying PCR-amplified DNA (see
Lundeberg et al., Bio/Tech. 10, 68-75 (1991)), and for
cleavage of the E. coli and yeast genomes at a single site
(see Koob and Szybalski, Science 250, 271-273 (1990)). This
stability is important for purposes of the present invention,
because, for the affinity selection or "panning" step of the
screening process to succeed, the connection between the
fusion protein and the plasmid that encodes the fusion protein
must remain intact for at least a portion of the complexes
throughout the panning step.
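
The dependence of panning on complex stability can be made concrete with a first-order decay calculation. The Python sketch below is an idealization: it assumes that dissociation follows simple exponential kinetics and that a dissociated fusion protein does not rebind its own plasmid. The function names and the 60 minute panning time are hypothetical.

```python
import math


def dissociation_rate_constant(half_life_min):
    """First-order rate constant (per minute) corresponding to a given half-life."""
    return math.log(2) / half_life_min


def fraction_intact(elapsed_min, half_life_min):
    """Fraction of protein-DNA complexes still intact after `elapsed_min`,
    assuming first-order dissociation with no rebinding."""
    return 0.5 ** (elapsed_min / half_life_min)


# Hypothetical 60 min panning step with half-lives of 15 min, 30 min, and 4 h.
for half_life in (15, 30, 240):
    print(half_life, round(fraction_intact(60, half_life), 3))
```

Under these assumptions about one quarter of the complexes survive a 60 minute panning step when the half-life is 30 min, whereas about 84% survive when the half-life is four hours.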

In fact, for purposes of the present invention, a 
longer half-life is preferred. A variety of techniques can be 
used to increase the stability of the DNA binding protein-DNA 
complex. These techniques include altering the amino acid 
sequence of the DNA binding protein, altering the DNA sequence 
of the DNA binding site, increasing the number of DNA binding 
sites on the vector, adding compounds that increase the 
stability of the complex (such as lactose or ONPF for the lac 
system), and various combinations of these techniques.

An illustrative random peptide library cloning
vector of the invention, plasmid pMC5, demonstrates some of
these techniques. Plasmid pMC5 has two lacO sequences to take
advantage of the strong cooperative interaction between a lac
repressor tetramer and two lac repressor binding sites, and
each of these sequences is the symmetric variant of the lacO
sequence, called lacOs or lacOid, which has about tenfold
higher affinity for repressor than the wild-type sequence (see
Sadler et al., Proc. Natl. Acad. Sci. USA 80, 6785-6789
(1983), and Simons et al., Proc. Natl. Acad. Sci. USA 81,
1624-1628 (1984)). Other "tight-binding" lac repressors and
coding sequences for those repressors that can be used for
purposes of the present invention are described in Maurizot
and Grebert, FEBS Lettrs. 239(1), 105-108 (1988), incorporated
herein by reference. See also Lehming et al., EMBO J. 6(10),
3145-3153 (1987).



Plasmid pMC5 is shown in Figs. 1 and 2, and details
of the construction of the plasmid are in Example 1, below.
This library plasmid contains two major functional elements in
a vector that permits replication and selection in E. coli.
The lacI gene is expressed under the control of the araB
promoter and has a series of restriction enzyme sites at the
3' end of the gene. Synthetic oligonucleotides cloned into
these sites fuse the lac repressor protein coding sequence to
additional random peptide coding sequence.

Once a vector such as pMC5 is constructed, one need
only clone peptide coding sequences in frame with the DNA
binding protein coding sequences to obtain a random peptide
library of the invention. Thus, the random peptide library of
the invention is constructed by cloning an oligonucleotide
that contains the random peptide coding sequence (and any
spacers, framework determinants, etc., as discussed below)
into a selected cloning site of a vector that encodes a DNA
binding protein and binding sites for that protein.

Using known recombinant DNA techniques (see
generally, Sambrook et al., Molecular Cloning, A Laboratory
Manual, 2d ed., Cold Spring Harbor Laboratory Press, Cold
Spring Harbor, N.Y., 1989, incorporated herein by reference),
one can synthesize an oligonucleotide that, inter alia,
removes unwanted restriction sites and adds desired ones,
reconstructs the correct portions of any sequences that have
been removed, inserts the spacer, conserved, or framework
residues, if any, and corrects the translation frame (if
necessary) to produce an active fusion protein comprised of a
DNA binding protein and random peptide. The central portion
of the oligonucleotide will generally contain one or more
random peptide coding sequences (variable region domain) and
spacer or framework residues. The sequences are ultimately
expressed as peptides (with or without spacer or framework
residues) fused to or in the DNA binding protein.

The variable region domain of the oligonucleotide
encodes a key feature of the library: the random peptide. The
size of the library will vary according to the number of
variable codons, and hence the size of the peptides, that are
desired. Generally, the library will be at least 10 s to 10 s
or more members, although smaller libraries may be quite
useful in some circumstances. To generate the collection of
oligonucleotides that forms a series of codons encoding a
random collection of amino acids and that is ultimately cloned
into the vector, a codon motif is used, such as (NNK)x, where
N may be A, C, G, or T (nominally equimolar), K is G or T
(nominally equimolar), and x is typically up to about 5, 6, 7,
or 8 or more, thereby producing libraries of penta-, hexa-,
hepta-, and octa-peptides or more. The third position may
also be G or C, designated "S". Thus, NNK or NNS (i) code for
all the amino acids, (ii) code for only one stop codon, and
(iii) reduce the range of codon bias from 6:1 to 3:1. There
are 32 possible codons resulting from the NNK motif: 1 for
each of 12 amino acids, 2 for each of 5 amino acids, 3 for
each of 3 amino acids, and only one of the three stop codons.
With longer peptides, the size of the library that is
generated can become a constraint in the cloning process, but
the larger libraries can be sampled, as described below. The
expression of peptides from randomly generated mixtures of
oligonucleotides in recombinant vectors is discussed in
Oliphant et al., Gene 44, 177-183 (1986), incorporated herein
by reference.
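
The codon counts stated above for the NNK motif can be checked by direct enumeration. The following is a minimal sketch, assuming the standard genetic code and equimolar bases; the function and variable names are illustrative only and do not appear in the specification.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ('*' = stop).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def degenerate_codons(third_position):
    """All codons with any base at positions 1 and 2 and a restricted third base."""
    return [a + b + c for a in BASES for b in BASES for c in third_position]


def census(third_position):
    """Count how many codons of the motif encode each amino acid (or stop)."""
    return Counter(CODON_TABLE[c] for c in degenerate_codons(third_position))


nnk = census("GT")  # K = G or T
# How many amino acids are encoded by 1, 2, or 3 codons of the motif.
multiplicity = Counter(n for aa, n in nnk.items() if aa != "*")

print("NNK codons:", len(degenerate_codons("GT")))
print("amino acids per codon count:", dict(multiplicity))
print("stop codons:", nnk["*"])
```

Calling `census("GC")` performs the same count for the NNS variant of the motif.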

An exemplified codon motif (NNK)x produces 32
codons, one for each of 12 amino acids, two for each of five
amino acids, three for each of three amino acids and one
(amber) stop codon. Although this motif produces a codon
distribution as equitable as available with standard methods
of oligonucleotide synthesis, it results in a bias against
peptides containing one-codon residues. For example, a
complete collection of hexacodons contains one sequence
encoding each peptide made up of only one-codon amino acids,
but contains 729 (3^6) sequences encoding each peptide with
three-codon amino acids.
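
The bias described above can be worked through numerically. The sketch below assumes the standard genetic code, under which the NNK motif gives one codon for F, Y, C, W, H, Q, I, M, N, K, D and E, two for P, T, V, A and G, and three for L, S and R; the two example peptides are hypothetical.

```python
from math import prod

# Codons per amino acid under NNK (third base G or T), standard genetic code.
NNK_CODONS = {aa: 1 for aa in "FYCWHQIMNKDE"}
NNK_CODONS.update({aa: 2 for aa in "PTVAG"})
NNK_CODONS.update({aa: 3 for aa in "LSR"})


def encoding_sequences(peptide):
    """Number of distinct (NNK)x sequences that encode the given peptide."""
    return prod(NNK_CODONS[aa] for aa in peptide)


# Hypothetical hexapeptides at the two extremes of the bias.
print(encoding_sequences("FYCWHQ"))  # only one-codon residues
print(encoding_sequences("LSRLSR"))  # only three-codon residues

# The ratio between the extremes is 3**x for a peptide of x residues.
for x in (6, 8, 12):
    print(x, 3 ** x)
```

The ratio between the most and least represented peptides is 3^x, so it triples with each added residue.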

An alternate approach that minimizes the bias
against one-codon residues involves the synthesis of 20
activated trinucleotides, each representing the codon for one
of the 20 genetically encoded amino acids. These
trinucleotides are synthesized by conventional means, removed
from the support with the base and 5'-OH-protecting groups
intact, and activated by the addition of 3'-O-phosphoramidite
(and phosphate protection with beta-cyanoethyl groups) by the
method used for the activation of mononucleosides, as
generally described in McBride and Caruthers, Tetr. Letters
22, 245 (1983), which is incorporated herein by reference.

Degenerate "oligocodons" are prepared using these
trimers as building blocks. The trimers are mixed at the
desired molar ratios and installed in the synthesizer. The
ratios will usually be approximately equimolar, but may be a
controlled unequal ratio to obtain the over- to under-
representation of certain amino acids coded for by the
degenerate oligonucleotide collection. The condensation of
the trimers to form the oligocodons is done essentially as
described for conventional synthesis employing activated
mononucleosides as building blocks. See generally, Atkinson
and Smith, Oligonucleotide Synthesis (M.J. Gait, ed.),
pp. 35-82 (1984). This procedure generates a population of
oligonucleotides for cloning that is capable of encoding an
equal distribution (or a controlled unequal distribution) of
the possible peptide sequences. This approach may be
especially useful in generating longer peptide sequences,
because the range of bias produced by the (NNK)x motif
increases by three-fold with each additional amino acid
residue.

When the codon motif is (NNK)x, as defined above,
and when x equals 8, there are 2.6 x 10^10 possible
octapeptides. A library containing most of the octapeptides
may be produced, but a sampling of the octapeptides may be
more conveniently constructed by making only a subset library
using about 0.1%, and up to as much as 1%, 5%, or 10%, of the
possible sequences, which subset of recombinant vectors is
then screened. As the library size increases, smaller
percentages are acceptable. If desired, to extend the
diversity of a subset library the recovered vector subset may
be subjected to mutagenesis and then to subsequent
rounds of screening. This mutagenesis step may be
accomplished in two general ways: the variable region of the
recovered phage may be mutagenized, or additional amino acids
may be added.
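
The size of the octapeptide sequence space, and the subset sizes mentioned above, follow from simple arithmetic. The sketch below assumes 20 amino acids per position; the function name is illustrative only.

```python
def peptide_space(length, alphabet=20):
    """Number of distinct peptides of the given length."""
    return alphabet ** length


total = peptide_space(8)
print(f"possible octapeptides: {total:.1e}")

# Subset libraries covering the percentages named in the text.
for fraction in (0.001, 0.01, 0.05, 0.10):
    print(f"{fraction:.1%} of the sequences = {fraction * total:.2e} clones")
```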

The process of constructing a random peptide
[illegible] is in Example 2, below. [illegible]
dodecamer peptide sequence, connected [illegible]
([illegible] would also be an acceptable linker), can be specified
by a degenerate oligonucleotide population containing twelve
[illegible] ligated to a four fold molar excess of annealed
oligonucleotides yielded a test library of 5.5 x 10 =
independent clones.

Once the library is constructed, host cells are
transformed with the library vectors. The successful
transformants are typically selected by growth under selective
conditions, e.g., with an appropriate antibiotic,
preferably ampicillin. This selection may be done on solid or
in liquid growth medium. For growth on solid medium, the
cells are grown at a high density (~10 a to xo* transformants
per m) on a large surface [illegible] containing
the antibiotic to form essentially a confluent lawn.
For growth in liquid culture, cells may be grown in L-broth
(with antibiotic selection) through about 10 or more
doublings. Growth in liquid culture may be convenient
because of the size of the libraries, while growth on solid
media likely provides less chance of bias during the
amplification process.

For best results with the present method, one should
control the ratio of fusion proteins to vectors so that
vectors are saturated with fusion proteins, without a vast
excess of fusion protein. Too little fusion protein could
result in vectors with free binding sites that might be filled
by fusion protein from other cells in the population during
cell lysis, thus breaking the connection between the genetic
information and the peptide ligand. Too much fusion protein
could lead to titration of available receptor sites during
panning by fusion protein molecules not bound to plasmid. To
control this ratio, one can use any of a variety of origin of
replication sequences to control vector number and/or an
inducible promoter, such as any of the promoters selected from
the group consisting of the araB, lambda pL (which can be
either nalidixic acid or heat inducible or both), trp, lac,
T7, T3, and tac or trc (these latter two are trp/lac hybrids)
promoters, to control fusion protein number. A regulated
promoter is also useful to limit the amount of time that the
peptide ligands are exposed to cellular proteases. By
inducing the promoter a short time before lysing the cells
containing a library, one can minimize the time during which
proteases act.

The araB promoter normally drives expression of the
enzymes of the E. coli araBAD operon, which are involved in
the catabolism of L-arabinose. The araB promoter is regulated
both positively and negatively, depending on the presence of
L-arabinose in the growth medium, by the araC protein. This
promoter can be catabolite repressed by adding glucose to the
growth medium and induced by adding L-arabinose to the medium.
Plasmid pMC5 encodes and can drive expression of the araC
protein (see Lee, The Operon (Miller and Reznikoff, eds., Cold
Spring Harbor Laboratory), pp. 389-409 (1980)). The araB
promoter is also regulated by the CAP protein, an activator
involved in the E. coli system of catabolite repression.

The expression level of the lacI fusion gene under
the control of the araB promoter in plasmid pMC5 can be
controlled over a very wide range through changes in the
growth medium. One can construct a vector to measure
expression of a fusion protein encoding gene to determine the
growth conditions needed to maintain an acceptable ratio of
repressors to vectors. Plasmid pMC3 is such a vector and can
be constructed by attaching an oligonucleotide that encodes a
short peptide linker (GADGA [SEQ ID NO: 65]) followed by
dynorphin B (YGGFLRRQFKVVT [SEQ ID NO: 7]) to the lacI gene in
plasmid pMC5. Monoclonal antibody D32.39 binds to
dynorphin B, a 13 amino acid opioid peptide (see Barrett and
Goldstein, Neuropeptides 6, 113-120 (1985), incorporated
herein by reference). These same reagents, plasmids pMC3 and
pMC5 and receptor D32.39, provide a test receptor and positive
and negative controls for use in panning experiments,
described below. Growth of E. coli transformants harboring
plasmid pMC3 in LB broth (10 g of tryptone, 5 g of NaCl, and
5 g of yeast extract per liter) allowed detection in a Western
blot of a faint band of the expected molecular weight, while
addition of 0.2% glucose rendered this band undetectable.
Growth in LB plus 0.2% L-arabinose led to the production of a
very heavy band on a stained gel, representing greater than
25% of the total cell protein.

To prevent overproduction of the fusion protein
encoded by a plasmid pMC5 derivative (or any other vector of
the present invention that has an inducible promoter), one can
grow the transformants first under non-inducing conditions (to
minimize exposure of the fusion protein to cellular proteases
and to minimize exposure of the cell to the possibly
deleterious effects of the fusion protein) and then under
"partial induction" conditions. For the araB promoter,
partial induction can be achieved with as little as 3.3 x
10^-5 % of L-arabinose (as demonstrated by increased repression
in the assay described below). A preferred way to achieve
partial induction consists of growing the cells in 0.1%
glucose until about 30 min. before the cells are harvested;
then, 0.2 to 0.5% L-arabinose is added to the culture to
induce expression of the fusion protein. Other methods to
express the protein controllably are available.

One can estimate the lacI expression level necessary
to fill the available binding sites in a typical plasmid pMC5
derivative by observing the behavior of strain ARI 20 (lacI-
lacZYA+) transformed with pMC3 or pMC5 (encoding only the
linker peptide GADGA [SEQ ID NO: 65]). Because the lacO sites
in plasmids pMC3 and pMC5 have higher affinity than those in
the lacZYA operon, the available repressor should fill the
plasmid sites first. Substantial repression of lacZYA should
be observed only if there is an excess of repressor beyond the
amount needed to fill the plasmid sites. As shown by color
level on X-gal indicator plates and direct assays of
β-galactosidase (see Miller, Experiments in Molecular Genetics
(Cold Spring Harbor Laboratory, Cold Spring Harbor, NY
(1972)), incorporated herein by reference), the amount of
repressor produced by pMC5 is sufficient to fill the lacO
sites and repress lacZYA more than 200 fold in ARI 20 host
cells during growth in normal LB medium (2.4 units compared to
500 units from ARI 20 transformed with vector pBAD18, which
has no lacI). The repressor encoded by pMC3 was partially
inactivated by the addition of the dynorphin B tail, allowing
about 10 fold higher expression of lacZYA (37 units). Because
of the apparent excess production of repressor under these
conditions, LB is a preferred medium for expressing similar
fusion proteins of the invention.
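
The fold values quoted above can be recomputed from the reported β-galactosidase units. This is a small worked calculation using only the three unit values given in the text; the function name is illustrative.

```python
def fold(reference_units, sample_units):
    """Ratio of beta-galactosidase activity between two strains."""
    return reference_units / sample_units


NO_REPRESSOR = 500.0  # ARI 20 transformed with pBAD18 (no lacI)
PMC5 = 2.4            # repressor fused to the linker peptide only
PMC3 = 37.0           # repressor fused to linker plus dynorphin B

print(f"repression by pMC5: {fold(NO_REPRESSOR, PMC5):.0f} fold")
print(f"repression by pMC3: {fold(NO_REPRESSOR, PMC3):.1f} fold")
print(f"lacZYA expression, pMC3 relative to pMC5: {fold(PMC3, PMC5):.1f} fold")
```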

At some point during the growth of the
transformants, the fusion protein will be expressed. Because
the random peptide vector also contains DNA binding sites for
the DNA binding protein, fusion proteins will bind to the
vectors that encode them. After these complexes form, the
cells containing a library are lysed, and the complexes are
partially purified away from cell debris. Following cell
lysis, one should avoid cross reaction between unbound fusion
proteins of one cell with heterologous DNA molecules of
another cell. The presence of high concentrations of the DNA
binding site for the DNA binding protein will minimize this
type of cross reaction. Thus, for the lac system, one can
synthesize a DNA duplex encoding the lacO or a mutated lacO
sequence for addition to the cell lysis solution. The
compound ONPF, as well as lactose, is known to strengthen the
binding of the lac repressor to lacO, so one can also, or
alternatively, add ONPF or lactose to the cell lysis solution
to minimize this type of cross reaction.

After cell lysis, in a process called panning,
plasmid-peptide complexes that bind specifically to
immobilized receptors are separated from nonbinding complexes,
which are washed away. Bulk DNA can be included during the
lysis and panning steps to compete for non-specific binding
sites and to lower the background of non-receptor-specific
binding to the immobilized receptor. A variety of washing
procedures can be used to enrich for retention of molecules
with desired affinity ranges. For affinity enrichment of
desired clones, from about 10^2 to io« library equivalents (a
library equivalent is one of each recombinant; 10^4 equivalents
of a library of 10^9 members is 10^13 vectors), but typically
10^3 to 10^4 library equivalents, are incubated with a receptor
(or portion thereof) for which a peptide ligand is
desired. The receptor is in one of several forms appropriate
for affinity enrichment schemes. In one example the receptor
is immobilized on a surface or particle, and the library is
then panned on the immobilized receptor generally according to
the procedure described below.
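
The definition of a library equivalent can be made concrete with a short calculation. The vector count follows directly from the text; the Poisson estimate of clone representation is an added assumption (random, unbiased sampling of clones), and the function names are illustrative.

```python
import math


def vectors_for_equivalents(library_members, equivalents):
    """Total vectors that make up the stated number of library equivalents."""
    return library_members * equivalents


def chance_clone_present(equivalents):
    """Poisson estimate (an assumption) that a given clone occurs at least once."""
    return 1.0 - math.exp(-equivalents)


# The example from the text: 10^4 equivalents of a 10^9-member library.
print(vectors_for_equivalents(10**9, 10**4))

# Under the Poisson assumption, a single equivalent leaves many clones absent.
for equivalents in (1, 3, 10):
    print(equivalents, round(chance_clone_present(equivalents), 4))
```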

A second example of receptor presentation is
receptor attached to a recognizable ligand (which may be
attached via a spacer). A specific example of such a ligand
is biotin. The receptor, so modified, is incubated with the
library, and binding occurs with both reactants in solution.
The resulting complexes are then bound to streptavidin (or
avidin) through the biotin moiety. See PCT patent publication
No. 91/07087. The streptavidin may be immobilized on a
surface such as a plastic plate or on particles, in which case
the complexes (vector/DNA binding protein/peptide/receptor/
biotin/streptavidin) are physically retained; or the
streptavidin may be labelled, with a fluorophore, for example,
to tag the active fusion protein for detection and/or
isolation by sorting procedures, e.g., on a fluorescence-
activated cell sorter.

Vectors that express peptides without the desired
specificity are removed by washing. The degree and stringency
of washing required will be determined for each
receptor/peptide of interest. A certain degree of control can
be exerted over the binding characteristics of the peptides
recovered by adjusting the conditions of the binding
incubation and the subsequent washing. The temperature, pH,
ionic strength, divalent cation concentration, and the volume
and duration of the washing will select for peptides within
particular ranges of affinity for the receptor. Selection
based on slow dissociation rate, which is usually predictive
of high affinity, is the most practical route. This may be
done either by continued incubation in the presence of a
saturating amount of free ligand, or by increasing the volume,
number, and length of the washes. In each case, the rebinding
of dissociated peptide-vector is prevented, and with
increasing time, peptide-vectors of higher and higher affinity
are recovered. Additional modifications of the binding and
washing procedures may be applied to find peptides that bind
receptors under special conditions.
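
The effect of wash time on selection by dissociation rate can be illustrated with a simple model. The sketch below assumes first-order dissociation with no rebinding, so that the fraction of complexes still bound after time t is exp(-k_off t); the two off-rates are hypothetical values chosen for illustration and are not taken from the specification.

```python
import math


def fraction_retained(k_off_per_min, wash_minutes):
    """Fraction of complexes still bound, assuming first-order dissociation
    and no rebinding of dissociated peptide-vector."""
    return math.exp(-k_off_per_min * wash_minutes)


# Hypothetical off-rates (per minute) for two clones.
clones = {"slow_dissociating": 0.001, "fast_dissociating": 0.05}

for minutes in (10, 60, 240):
    kept = {name: fraction_retained(k, minutes) for name, k in clones.items()}
    ratio = kept["slow_dissociating"] / kept["fast_dissociating"]
    print(f"{minutes:>4} min wash: slow/fast retention ratio = {ratio:.3g}")
```

Under these assumptions the retention ratio grows with wash time, which matches the statement that longer washes recover peptide-vectors of progressively higher affinity.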

Although the screening method is highly specific,
the procedure generally does not discriminate between peptides
of modest affinity (micromolar dissociation constants) and
those of high affinity (nanomolar dissociation constants or
greater). The ability to select peptides with relatively low
affinity may be the result of multivalent interaction between
a vector/fusion protein complex and a receptor. For instance,
when the receptor is an IgG antibody, each complex may bind to
more than one antibody binding site, either by a single
complex binding through the multiple peptides displayed to
both sites of a single IgG molecule or by forming a network of
complex-IgG. Multivalent interaction produces a high avidity
and tenacious adherence of the vector during washing.
Multivalent interactions can be mimicked by using a high
density of immobilized monovalent receptor.

To enrich for the highest affinity peptide ligands,
a substantially monovalent interaction between vector and the
receptor (typically immobilized on a solid phase) may be
appropriate. The screening (selection) with substantially
monovalent interaction can be repeated as part of additional
rounds of amplification and selection of vectors. Monovalent
interactions may be achieved by employing low concentrations
of receptor, such as the Fab binding fragment of an antibody
molecule.

A strategy employing a combination of conditions
favoring multivalent or monovalent interactions can be used to
advantage in producing new peptide ligands for receptor
molecules. By conducting the first rounds of screening under
conditions to promote multivalent interactions, one can then
use high stringency washing to reduce greatly the background
of non-specifically bound vectors. This high avidity step may
select a large pool of peptides with a wide range of
affinities, including those with relatively low affinity.
Subsequent screening under conditions favoring increasingly
monovalent interactions and isolation of plasmid complexes
based on a slow dissociation rate may then allow the
identification of the highest affinity peptides.

After washing the receptor-fusion protein-vector
complexes to select for peptides of the desired affinity, the
vector DNA is then released from bound complexes by, for
example, treatment with high salt or extraction with phenol,
or both. For the lac system, one can use IPTG, a compound
known to decrease the stability of the lac repressor-lacO
complex, to dissociate the plasmid from the fusion protein.
In a preferred embodiment, the elution buffer is composed of
1 mM IPTG, 10 µg/ml of a double-stranded oligonucleotide that
contains lacOs, and 0.2 M KCl. Once released from bound
complexes, the plasmids are reintroduced into E. coli by
transformation. Because of its high efficiency, the preferred
method of transformation is electroporation. Using this new
population of transformants, one can repeat additional cycles
of panning to increase the proportion of peptides in the
population that are specific for the receptor. The structure
of the binding peptides can then be determined by sequencing
the 3' region of the lacI fusion gene.

As noted above, antibody D32.39 and the pMC3 complex
serve as a receptor-ligand positive control in panning
experiments to determine ability to recover plasmids based on
the sequence of the fusion peptide. Useful negative controls
are pMC5, which encodes only the linker fusion peptide (GADGA
[SEQ ID NO: 65]), and pMC1, which encodes the dynorphin B
peptide, but lacks the lacO sequences carried by pMC3 and
pMC5. Lysates of E. coli strains carrying each plasmid were
panned on D32.39 immobilized on polystyrene petri dishes.
After washing, plasmids were recovered from complexes bound to
the plates by phenol extraction, followed by transformation of
E. coli.

The results with pure lysates demonstrated about 100
fold more transformants recovered from pMC3 lysates as
compared to the negative controls. The results with mixed
lysates revealed enrichment of pMC3 versus controls among the
population of recovered plasmids. Experiments with cells that
were mixed before lysis yielded similar results. These
results show that the plasmid-LacI-peptide complexes were
sufficiently stable to allow enrichment of plasmids on the
basis of the peptide the plasmids encode.
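
The way repeated rounds of panning raise the proportion of specific plasmids can be sketched with a simple two-population model. The model is an illustrative assumption and is not described in the specification: binders are taken to be recovered at a constant fold advantage over non-binders in every round. The 100-fold figure is borrowed from the pure-lysate result above, and the starting fraction is hypothetical.

```python
def next_fraction(fraction, enrichment):
    """Fraction of binders after one round, given their fold recovery advantage."""
    selected = fraction * enrichment
    return selected / (selected + (1.0 - fraction))


def panning_course(start_fraction, enrichment, rounds):
    """Binder fraction before panning and after each successive round."""
    course = [start_fraction]
    for _ in range(rounds):
        course.append(next_fraction(course[-1], enrichment))
    return course


# Hypothetical start: one binder per million clones, 100-fold advantage per round.
for round_number, fraction in enumerate(panning_course(1e-6, 100.0, 4)):
    print(round_number, f"{fraction:.3g}")
```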

The random dodecapeptide library in pMC5 described
above was used in the screening method of the invention to
identify vectors that encode a fusion protein that comprised a
peptide that would bind to D32.39 antibody coupled to sheep
antimouse antibody coated magnetic beads. The number of
complexes added to the beads at each round of panning yielded
the equivalent of 10^10 to 10^11 transformants (see Example 3).
After panning, the recovered plasmids yielded transformants
ranging in number from about 10 s in early rounds to almost
10^11 in the fourth and final round. Compared to the number of
transformants from antibody panned complexes, panning against
unmodified polystyrene beads produced orders of magnitude
fewer transformants.

The above results demonstrate that the DNA binding
activity of lac repressor can act as a link between random
peptides and the genetic material encoding them and so serve
as the base on which to construct large peptide ligand
libraries that can be efficiently screened. In the screening
process, plasmid-repressor-peptide complexes are isolated by
panning on immobilized receptor, the plasmids are amplified
after transformation of E. coli, and the procedure is repeated
to enrich for plasmids encoding peptides specific for the
receptor. The repressor binds to the library plasmid with
sufficient avidity to allow panning of the library on
immobilized receptor without problematic levels of
dissociation. This system can be used to identify a series of
related peptides that bind to a monoclonal antibody whose
epitope has not been characterized and to identify peptide
ligands for other receptors.

Once a peptide ligand of interest has been 
identified, a variety of techniques can be used to diversify a 
peptide library to construct ligands with improved properties. 
In one approach, the positive vectors (those identified in an 
early round of panning) are sequenced to determine the 
identity of the active peptides. Oligonucleotides are then
synthesized based on these peptide sequences, employing all
bases at each step at concentrations designed to produce
slight variations of the primary oligonucleotide sequences.
This mixture of (slightly) degenerate oligonucleotides is then
cloned into the random peptide library expression vector as
described herein. This method produces systematic, controlled
variations of the starting peptide sequences but requires
that individual positive vectors be sequenced before
mutagenesis. This method is useful for expanding the 
diversity of small numbers of recovered vectors. 

Another technique for diversifying a selected 
peptide involves the subtle misincorporation of nucleotide 
changes in the coding sequence for the peptide through the use 
of the polymerase chain reaction (PCR) under low fidelity 
conditions. A protocol described in Leung et al., Technique
1, 11-15 (1989), utilizes altered ratios of nucleotides and
the addition of manganese ions to produce a 2% mutation 
frequency. 
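
To gauge what a 2% mutation frequency means for a selected peptide, one can compute the expected number of changes in its coding sequence. The sketch below makes two assumptions that the text does not state: that the 2% figure applies independently to each nucleotide, and that the mutagenized region is the 36 nucleotides encoding a dodecapeptide.

```python
from math import comb


def mutation_distribution(length_nt, rate_per_nt):
    """Binomial probabilities of 0, 1, 2, ... mutations in a sequence."""
    return [
        comb(length_nt, k) * rate_per_nt**k * (1 - rate_per_nt) ** (length_nt - k)
        for k in range(length_nt + 1)
    ]


CODING_LENGTH = 36  # twelve codons (assumed region of interest)
RATE = 0.02         # assumed to apply per nucleotide

dist = mutation_distribution(CODING_LENGTH, RATE)
expected = sum(k * p for k, p in enumerate(dist))

print(f"expected mutations per clone: {expected:.2f}")
print(f"fraction of clones left unchanged: {dist[0]:.3f}")
print(f"fraction with exactly one change:  {dist[1]:.3f}")
```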

Yet another approach for diversifying a selected 
random peptide vector involves the mutagenesis of a pool, or 
subset, of recovered vectors. Recombinant host cells 
transformed with vectors recovered from panning are pooled and 
isolated. The vector DNA is mutagenized by treating the cells 
with, e.g., nitrous acid, formic acid, hydrazine, or by use of 
a mutator strain as described below. These treatments produce
a variety of mutations in the vector DNA. The segment
containing the sequence encoding the variable peptide can
optionally be isolated by cutting with restriction nuclease(s)
specific for sites flanking the variable region and then
recloned into undamaged vector DNA. Alternatively, the
mutagenized vectors can be used without recloning of the 
mutagenized random peptide coding sequence. 

In the second general approach for diversifying a 
set of peptide ligands, that of adding additional amino acids 
to a peptide or peptides found to be active, a variety of 
methods are available. In one, the sequences of peptides
selected in early panning are determined individually and new
oligonucleotides, incorporating all or part of the determined
sequence and an adjoining degenerate sequence, are
synthesized. These are then cloned to produce a secondary
library. 

In another approach that adds a second variable 
region to a pool of random peptide expression vectors, a 
restriction site is installed next to the primary variable 
region. Preferably, the enzyme should cut outside of its 
recognition sequence, such as BspMI, which cuts leaving a four 
base 5' overhang, four bases to the 3' side of the recognition 
site. Thus, the recognition site may be placed four bases 
from the primary degenerate region. To insert a second 
variable region, a degenerately synthesized oligonucleotide is 
then ligated into this site to produce a second variable 
region juxtaposed to the primary variable region. This 
secondary library is then amplified and screened as before. 

While in some instances it may be appropriate to 
synthesize peptides having contiguous variable regions to bind 
certain receptors, in other cases it may be desirable to 
provide peptides having two or more regions of diversity 
separated by spacer residues. For example, the variable 
regions may be separated by spacers that allow the diversity 
domains of the peptides to be presented to the receptor in 
different ways. The distance between variable regions may be 
as little as one residue or as many as five to ten to up to 
about 100 residues. For probing a large binding site, one may
construct variable regions separated by a spacer containing 20
to 30 amino acids. The number of spacer residues, when
present, will preferably be at least two to three or more but
usually will be less than eight to ten. An oligonucleotide
library having variable domains separated by spacers can be
represented by the formula: (NNK)y-(abc)n-(NNK)z, where N and
K are as defined previously (note that S as defined previously
may be substituted for K); y + z is equal to about 5, 6, 7, 8,
or more; a, b and c represent the same or different
nucleotides comprising a codon encoding spacer amino acids;
and n is up to about 20 to 30 codons or more.
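
The formula (NNK)y-(abc)n-(NNK)z can be written out as a degenerate template and sampled. The sketch below is illustrative only: the spacer codons chosen for Gly-Pro-Gly (GGT, CCG, GGT) and the values of y and z are assumptions, and the function names do not appear in the specification.

```python
import random

DEGENERATE = {"N": "ACGT", "K": "GT", "S": "GC"}


def spaced_library_template(y, spacer_codons, z):
    """Template string for (NNK)y-(abc)n-(NNK)z with fixed spacer codons."""
    return "NNK" * y + "".join(spacer_codons) + "NNK" * z


def sample_oligo(template, rng):
    """Draw one concrete sequence, assuming equimolar bases at degenerate sites."""
    return "".join(rng.choice(DEGENERATE.get(base, base)) for base in template)


# Assumed spacer: Gly-Pro-Gly, with one arbitrary codon choice per residue.
spacer = ["GGT", "CCG", "GGT"]
template = spaced_library_template(3, spacer, 3)
print(template)

rng = random.Random(0)  # fixed seed so the sketch is repeatable
for _ in range(3):
    print(sample_oligo(template, rng))
```

Each sampled sequence keeps the spacer codons fixed while the flanking NNK codons vary.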

The spacer residues may be somewhat flexible,
comprising oligoglycine, for example, to provide the diversity
domains of the library with the ability to interact with sites
in a large binding site relatively unconstrained by attachment
to the DNA binding protein. Rigid spacers, such as, e.g.,
oligoproline, may also be inserted separately or in
combination with other spacers, including glycine residues.
The variable domains can be close to one another with a spacer
serving to orient the one variable domain with respect to the
other, such as by employing a turn between the two sequences,
as might be provided by a spacer of the sequence Gly-Pro-Gly,
for example. To add stability to such a turn, it may be
desirable or necessary to add Cys residues at either or both
ends of each variable region. The Cys residues would then
form disulfide bridges to hold the variable regions together
in a loop, and in this fashion may also serve to mimic a
cyclic peptide. Of course, those skilled in the art will
appreciate that various other types of covalent linkages for
cyclization may also be accomplished.

The spacer residues described above can also be 
encoded on either or both ends of the variable nucleotide 
region. For instance, a cyclic peptide coding sequence can be 
made without an intervening spacer by having a Cys codon on 
both ends of the random peptide coding sequence. As above, 
flexible spacers, e.g., oligoglycine, may facilitate 
interaction of the random peptide with the selected receptors. 
Alternatively, rigid spacers may allow the peptide to be
presented as if on the end of a rigid arm, where the number of
residues, e.g., Pro, determines not only the length of the arm
but also the direction for the arm in which the peptide is
oriented. Hydrophilic spacers, made up of charged and/or
uncharged hydrophilic amino acids (e.g., Thr, His, Asn, Gln,
Arg, Glu, Asp, Met, Lys, etc.), or hydrophobic spacers made up
of hydrophobic amino acids (e.g., Phe, Leu, Ile, Gly, Val,
Ala, etc.) may be used to present the peptides to binding
sites with a variety of local environments.

The present invention can be used to construct
improved spacer molecules. For example, one can construct a
random peptide library that encodes a DNA binding protein,
such as the lac repressor or a cysteine depleted lac repressor
(described below), a random peptide of formula (NNK)5 (sequences
up to and including (NNK)10 or (NNK)15 could also be used), and a
peptide ligand of known specificity. One would then screen
the library for improved binding of the peptide ligand to the
receptor specific for the ligand using the method of the
present invention; fusion proteins that exhibit improved
specificity would be isolated together with the vector that
encodes them, and the vector would be sequenced to determine
the structure of the spacer responsible for the improved
binding.

Unless modified during or after synthesis by the 
translation machinery, recombinant peptide libraries consist 
of sequences of the 20 normal L-araino acids. While the 
available structural diversity for such a library is large, 
additional diversity can be introduced by a variety of means, 
such as chemical modifications of the amino acids. For 
example, as one source of added diversity a peptide library of 
the invention can be subjected to carboxy terminal amidation. 
Carboxy terminal amidation is necessary to the activity of 
many naturally occurring bioactive peptides. This 
modification occurs in vivo through cleavage of the N-C bond 
of a carboxy terminal Gly residue in a two-step reaction 
catalyzed by the enzymes peptidylglycine alpha-amidation 
monooxygenase (PAM) and hydroxyglycine aminotransferase 
(HGAT). See Eipper et al., J. Biol. Chem. 266, 7827-7833
(1991); Mizuno et al., Biochem. Biophys. Res. Comm. 137(3),
984-991 (1986); Murthy et al., J. Biol. Chem. 261(4),
1815-1822 (1986); Katopodis et al., Biochemistry 29, 6115-6120
(1990); and Young and Tamburini, J. Am. Chem. Soc. 111,
1933-1934 (1989), each of which is incorporated herein by
reference.

Amidation can be performed by treatment with 
enzymes, such as PAM and HGAT, in vivo or in vitro, and under 
conditions conducive to maintaining the structural integrity 
of the fusion protein/vector complex. In a random peptide
library of the present invention, amidation will occur on a 
library subset, i.e., those peptides having a carboxy terminal 
Gly. A library of peptides designed for amidation can be 
constructed by introducing a Gly codon at the end of the 
variable region domain of the library. After amidation, an 
enriched library serves as a particularly efficient source of 
ligands for receptors that preferentially bind amidated 
peptides. Many of the C-terminus amidated bioactive peptides
are processed from larger pro-hormones, where the amidated 
peptide is flanked at its C-terminus by the sequence 
-Gly-Lys-Arg-X . . . [SEQ ID NO:67] (where X is any amino
acid). Oligonucleotides encoding the sequence
-Gly-Lys-Arg-X-Stop [SEQ ID NO:67] can be placed at the 3' end
of the variable oligonucleotide region. When expressed, the
Gly-Lys-Arg-X [SEQ ID NO:67] is removed by in vivo or in vitro
enzymatic treatment, and the peptide library is carboxy
terminal amidated as described above.
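
The size of the amidatable subset of an unmodified random library can be estimated with a short calculation. The sketch below assumes that the final codon of the variable region is an NNK codon with equimolar bases, and uses the standard genetic code fact that glycine is encoded by GGN. The names are illustrative, and truncation by upstream stop codons is ignored.

```python
from fractions import Fraction
from itertools import product

GLY_CODONS = {"GGA", "GGC", "GGG", "GGT"}  # glycine is GGN in the standard code

# All NNK codons: any base at positions 1 and 2, G or T at position 3.
nnk_codons = ["".join(c) for c in product("ACGT", "ACGT", "GT")]

# NNK codons that encode glycine
gly_nnk = sorted(c for c in nnk_codons if c in GLY_CODONS)

# Fraction of clones whose last variable codon encodes Gly
fraction_c_terminal_gly = Fraction(len(gly_nnk), len(nnk_codons))

print(gly_nnk, fraction_c_terminal_gly)
```

Under these assumptions roughly one clone in sixteen ends in Gly, which is consistent with the motivation for placing a fixed Gly codon at the end of the variable region when a library is designed for amidation.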

Conditions for C-terminal amidation of the libraries
of the invention were developed using a model system that
employed an antibody specific for the amidated C-terminus of
the peptide cholecystokinin (CCK). The reaction conditions to
make the peptide α-amidating monooxygenase (PAM) enzyme active
when used to amidate the libraries were developed using an
iodine-labeled small peptide substrate and an ELISA with a
positive control glycine extended CCK octamer peptide fused to
the lac repressor. The E. coli strain used in the experiment
carried plasmid pJS129, which encodes the cysteine free lac
repressor (described below) fused to the CCK substrate peptide
(DYMGWMDFG). A panning lysate was made from this strain using
the standard panning protocol (see Example 6). After
concentrating the column fractions in a Centriprep 100, four
samples were prepared, each containing 0.25 ml of lysate and
0.25 ml of 2x PAM buffer (prepared by mixing 0.2 ml of 1 M
HEPES, pH 7.4, 0.9 ml of 20% lactose, 3.6 [...] of [...],
0.1 ml of a solution composed of 20 mg/ml catalase, [...] µl of
6 M NaI, and 150 µl of 0.1 M ascorbic acid). PAM enzyme was
added to the tubes, which were incubated at 37°C for 30 minutes.
Then, 120 µl of 5% BSA in HEKL buffer and 6 µl of herring DNA
were added to each tube; the contents of each tube were then
added to 6 microtiter wells that had been coated with
2 µg/well anti-CCK antibody and blocked with BSA. The
microtiter plate was agitated at 4°C for 150 minutes, washed
5x with cold HEKL, washed for 10 minutes with a solution
composed of HEKL, 1% BSA, and 0.1 mg/ml herring DNA, and
washed again 5x with cold HEKL. The plasmids were eluted
using the standard protocol and used to transform E. coli host
cells. The results showed a dramatic increase in the recovery
of plasmid transformants with increasing amounts of PAM
enzyme, demonstrating that the amidation reaction worked.

Other modifications found in naturally occurring 
peptides and proteins can be introduced into the libraries to 
provide additional diversity and to contribute to a desired 
biological activity. For example, the variable region library 
can be provided with codons that code for amino acid residues 
involved in phosphorylation, glycosylation, sulfation,
isoprenylation (or the addition of other lipids), etc. 
Modifications not catalyzed by naturally occurring enzymes can 
be introduced by chemical means (under relatively mild 
conditions) or through the action of, e.g., catalytic 
antibodies and the like. In most cases, an efficient strategy
for library construction involves specifying the enzyme (or 
chemical) substrate recognition site within or adjacent to the 
variable nucleotide region of the library so that most members 
of the library are modified. The substrate recognition site 
added can be simply a single residue (e.g., serine for 
phosphorylation) or a complex consensus sequence, as desired.

[...] purpose. The method involves introducing
nucleotide sequences that code for conserved structural
residues into or adjacent to the variable nucleotide region so
as to contribute to the desired peptide structure. Positions
nonessential to the structure are allowed to vary.

A degenerate peptide library as described herein can
incorporate the conserved frameworks to produce and/or
[...]. Several families of bioactive
peptides are related by a secondary structure that results in
a conserved "framework," which in some cases is a [...]
variable residues [...]
by a disulfide bond, as discussed above.

In some cases, a more complex framework that
[...]
of a peptide family. An example of this class is the
conotoxins, peptide toxins of 10 to 30 amino acids produced
by venomous molluscs known as predatory cone snails. The
conopeptides generally possess a [...]
crosslinking. Of [...] highly crosslinked,
[...] groups, mu and omega, that have conserved
primary frameworks as follows (C is Cys):

mu      CC . . . C . . . C . . . CC
omega   C . . . C . . . CC . . . C . . . C

The number of residues flanked by each pair of Cys residues
varies from 2 to 6 in the peptides reported to date. The side
chains of the residues that flank the Cys residues are
apparently not conserved in peptides with different
specificity, as in peptides from different species with
[...] identical specificities. Thus, the conotoxins have
exploited a conserved, densely crosslinked motif as a
framework for hypervariable regions to produce a huge array of
peptides with many different pharmacological effects.

The mu and omega classes (with 6 Cys residues) have
15 possible combinations of disulfide bonds. Usually only one
of these conformations is the active ("correct") form. The
correct folding of the peptides may be directed by a conserved 
40 residue peptide that is cleaved from the N-terminus of the 
conopeptide to produce the small, mature bioactive peptides 
that appear in the venom. 
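
The figure of 15 disulfide combinations for six Cys residues can be checked by enumerating every way of pairing the cysteines. The sketch below is a minimal illustration; the numbering of the Cys residues (1 to 6) and the function name are assumptions made for the example.

```python
def disulfide_pairings(cysteines):
    """Return every way to pair all cysteines into disulfide bonds."""
    if not cysteines:
        return [[]]
    first, rest = cysteines[0], cysteines[1:]
    pairings = []
    for i, partner in enumerate(rest):
        # Bond the first cysteine to one partner, then pair the remainder.
        remaining = rest[:i] + rest[i + 1:]
        for sub in disulfide_pairings(remaining):
            pairings.append([(first, partner)] + sub)
    return pairings


six_cys = disulfide_pairings([1, 2, 3, 4, 5, 6])
print(len(six_cys))
for bonds in six_cys[:3]:
    print(bonds)
```

The count follows the pattern 5 x 3 x 1: the first Cys has five possible partners, the next unpaired Cys has three, and the last two must bond to each other.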

With 2 to 6 variable residues between each pair of 
Cys residues, there are 125 (5^3) possible framework
arrangements for the mu class (2,2,2 to 6,6,6), and 625 (5^4)
possible for the omega class (2,2,2,2 to 6,6,6,6).
Randomizing the identity of the residues within each framework
produces 10^10 to >10^30 peptides. "Cono-like" peptide
libraries are constructed having a conserved disulfide
framework, varied numbers of residues in each hypervariable
region, and varied identity of those residues. Thus, a
sequence for the structural framework for use in the present
invention comprises Cys-Cys-Y-Cys-Y-Cys-Cys, or
Cys-Y-Cys-Y-Cys-Cys-Y-Cys-Y-Cys, where Y is (NNK)x or (NNS)x;
N is A, C, G or T; K is G or T; S is G or C; and x is from 2
to 6.
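
The framework counts quoted above can be reproduced by enumerating the allowed loop lengths. The sketch below assumes three variable regions for the mu class and four for the omega class, each 2 to 6 residues long, with 20 possible amino acids per variable position. The names are illustrative.

```python
from itertools import product

LOOP_LENGTHS = range(2, 7)  # 2 to 6 variable residues between Cys residues
AMINO_ACIDS = 20


def frameworks(n_loops):
    """All loop-length arrangements for a framework with n_loops variable regions."""
    return list(product(LOOP_LENGTHS, repeat=n_loops))


def diversity_range(arrangements):
    """Smallest and largest number of peptides for a single arrangement."""
    sizes = [AMINO_ACIDS ** sum(lengths) for lengths in arrangements]
    return min(sizes), max(sizes)


mu = frameworks(3)
omega = frameworks(4)
print(len(mu), len(omega))
print(diversity_range(mu))
print(diversity_range(omega))
```

For the omega class this gives a range from 20^8 (about 2.6 x 10^10) to 20^24 (about 1.7 x 10^31) peptides per arrangement, which is of the same order as the range stated in the text.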

Framework structures that require the formation of 
one or more disulfide bonds under oxidizing conditions may 
create problems with respect to the natural lac repressor,
which has 3 cysteine residues. All 3 of these residues,
however, can be changed to other amino acids without a serious
effect on the function of the molecule (see Kleina and Miller,
supra). Plasmid pJS123 is derived from plasmid pMC5 by site
specific mutagenesis and encodes a lac repressor identical to
the lac repressor encoded on plasmid pMC5, except the cysteine
codon at position 107 has been changed to a serine codon; the
cysteine codon at position 140 has been changed to an alanine
codon (alanine works better than serine at this position); and
the cysteine codon at position 281 has been changed to a
serine codon. Plasmid pJS123 (available in strain ARI 161
from the American Type Culture Collection under the accession
number ATCC No. 68819) is therefore preferred for constructing
random peptide [...] involving [...] framework
structures.

The lac repressor coding sequence in plasmid pJS123
can be subjected to mutagenesis to improve the binding [...] of the
mutant coding sequence with lacO type sequences. A preferred
method for performing this mutagenesis involves the
construction of a coding sequence [...]
peptide, a peptide of known specificity. The resulting vector
is subjected to mutagenesis by any of a variety of methods;
a [...] method involves transformation [...] (see
Schaaper, Proc. Natl. Acad. Sci. USA 85, 8126-8130 (1988),
incorporated herein by reference), and culture of the
transformants to produce the fusion protein [...].
The fusion proteins are screened by the present
method to find vectors that have been mutated to
increase the binding affinity of the cysteine depleted lac
repressor to the lacO sequence. One could combine this method
with the method
of constructing improved spacers, described above, to select 

for an improved repressor-peptide spacer
molecule.

In this fashion, pJS123 was modified to
create [...], which was then introduced into a
mutator strain. Oligonucleotides were then cloned into the
mutagenized vector to encode a D32.39 epitope joined to
[...] region of 5 [...] amino acids.
This library was panned on D32.39 antibody for 5 rounds under
[...] stringent conditions. Individual clones were
selected from the population of plasmids surviving after the
[...] round and tested by a variety of assays. These assays
included: (1) tests for ability to repress the chromosomal
lac operon (a test of DNA binding affinity); (2) tests for
plasmid copy number; (3) ELISA with D32.39 antibody to test
[...] of the peptide epitope; and (4) tests of plasmid
recovery during panning. Several of these plasmids were
sequenced in the random tail region to determine the structure
of the linker peptide. A series of subcloning experiments
was also conducted to determine regions of the plasmids that
determined the observable properties of the plasmids.
Finally, plasmids carrying a higher copy number replication
origin and encoding one of the linker regions were constructed
and sequenced to ascertain that no base changes in the
cysteine free repressor gene, as compared to the starting
plasmid, were introduced. The linker tail of this plasmid and
the cloning strategy for random libraries are shown in Fig. 4.
Two versions of the vector were constructed, one with the
cysteine-free lac repressor gene (ARI246/pJS141; ATCC
No. 69088) and one with the wild-type lac repressor gene
(ARI280/pJS142; ATCC No. 69087). (These cell lines will be
maintained at an authorized depository and replaced in the
event of mutation, nonviability or destruction for a period of
at least five years after the most recent request for release
of a sample was received by the depository, for a period of at
least thirty years after the date of the deposit, or during
the enforceable life of the related patent, whichever period
is longest. All restrictions on the availability to the
public of these cell lines will be irrevocably removed upon
the issuance of a patent from the above-captioned
application.)

ARI246 has the genotype E. coli B lon-11 sulA1
hsdR17 Δ(ompT-fepC) ΔclpA319::kan lacI42::Tn10 lacZU118. The
lon-11, Δ(ompT-fepC), and ΔclpA319::kan mutations destroy
three genes involved in proteolysis, so this strain should
allow greater diversity of peptides to be expressed on the
library particles. The sulA1 mutation suppresses the
filamentation phenotype caused by the lon-11 allele. The
hsdR17 mutation destroys the restriction system to allow more
efficient transformation of unmodified DNA. The lacI42::Tn10
mutation eliminates expression of the chromosomal lac
repressor gene to prevent competition of wild-type repressor
for binding sites on the library plasmids. The lacZU118
allele stops expression of β-galactosidase, which would
otherwise be constitutive in the lacI42::Tn10 background,
leading to unnecessary use of cell resources and reducing
[...]. E. coli B cells grow more quickly than [...]
cells and yield excellent electrocompetent cells for
transformation. Transformation frequencies of around 5 x
10^[...] tf/µg of Bluescript plasmid DNA can be frequently
observed with ARI246 cells. ARI280 has the same genotype as
ARI246, except that the lacI mutation has been converted to a
deletion by selecting for loss of the Tn10 insertion, and a
recA::cat mutation has been introduced. The recA::cat
mutation is useful to prevent homologous recombination between
plasmids. As a consequence, the library plasmids exist more
[...] as monomers, rather than the multimeric forms that can
be observed in ARI246. The monomers are better for two
reasons: (1) they reduce the valency of peptides per library
particle, allowing more stringent selection for higher
affinity [...] ligands; and (2) [...] a given
amount of DNA, [...] the number of
library [...] that can be [...] receptors. The
recA::cat mutation [...] less healthy, [...] growth
[...] and the transformation frequency is reduced
to about 2 x 10^10 tf/µg.

Other changes can be introduced to provide residues
that contribute to the peptide structure, around which the
variable amino acids are encoded by the library members. For
example, these residues can provide for alpha helices, a
helix-turn-helix structure, four helix bundles, a beta-sheet,
or other secondary or tertiary structural (framework or
scaffolding) motifs. See U.S. S.N. 07/718,577, filed June 20,
1991, incorporated herein by reference. DNA binding peptides,
such as those that correspond to the transcriptional
transactivators referred to as leucine zippers, can also be
used as a framework, provided the DNA binding peptide is
distinct from the DNA binding protein component of the fusion
protein and the library vector does not contain the binding
site for the DNA binding peptide. In these peptides, leucine
residues are repeated every seven residues in the motifs, and
the region is adjacent to an alpha helical region rich in
lysines and arginines and characterized by a conserved helical
face and a variable helical face.

Other specialized forms of structural constraints
can also be used in the present invention. For example,
[...]

In another aspect related to frameworks for a
peptide library, information from the structure of
[...]
[...] parental peptide ligand. This
method is useful for screening for any receptor-ligand
[...] interleukins, insulin-like growth factor, etc. In
this embodiment of the invention, the peptide library can
contain [...] to 100 different members, [...] libraries
of 1000 or more members will generally be used.

Thus, the present invention can be used to construct
peptide [...] diversity. [...] features of the
preferred embodiment of the invention, called "peptides on
plasmids," [...] protein [...]
distinct from those of the previously described phage
[...]. The random peptides of the present [...] can
be displayed with a free carboxy terminus, instead of being
[...], and so add diversity to the [...] structures
available for receptor binding. The presentation of peptide
ligands at the carboxy terminus also facilitates [...], as
discussed above. This mode of display [...] stop
codons in the degenerate region, which occur more often in
longer degenerate oligonucleotides, shorten rather than
destroy individual clones. The presence of stop [...]
random peptide coding sequence [...]
additional diversity, by creating peptides of [...]
lengths. The lac repressor fusions of the invention [...]

In addition, these lac repressor fusions are
cytoplasmic proteins, unlike the [...] fusions, [...]
exported to the periplasm. [...] of both fusion [...]
types of libraries are exposed to different cellular
compartments and so are exposed to different sets of [...].
There is no need, however, for peptides fused
[...] compatible with the protein [...]
and the formation of an intact phage coat. The [...] need
simply be compatible with the formation of [...]
repressor dimer, which is the [...] form of the protein
[...]

As in the phage system, the lac repressor fusion
library displays multiple copies of the [...]
library particle. Each [...]
peptides [...] available for binding [...].
[...] display allows the isolation of ligands with
moderate affinity (micromolar Kd; see Cwirla et al., supra).
[...] known, high affinity [...] these
[...] ligands can obscure the [...]
simply by [...]
overcome by immobilizing [...] receptors [...]
[...] affinity [...] ligands to be
[...] receptors whose normal
[...]
display will be an advantage for identifying initial families [...]
additional rounds of screening under monovalent conditions. [...]
isolation of peptides with a wide range of affinities,
[...]

Libraries of peptides produced and screened
according to the present invention are particularly useful for
[...] potential epitopes as described [...] clear
advantages [...] methods based [...] now in
use and [...] (1987). [...] libraries are
useful in providing new ligands for important binding
molecules, such as hormone receptors, adhesion molecules,
enzymes, and the like.

The present libraries can be generalized to allow
the screening of a wide variety of peptide and protein
ligands. In addition, the vectors are constructed so that
screening of other ligands encoded by the plasmid is possible.
For example, the system can be simply modified to allow
screening of RNA ligands. A known RNA binding protein (e.g.,
a ribosomal protein) is fused to the DNA binding protein. A
promoter elsewhere on the vector drives expression of an RNA
molecule composed of the known binding site for the RNA
binding protein followed by random sequence. The DNA-RNA
binding fusion protein would link the genetic information of
the vector with each member of a library of RNA ligands.
These RNA ligands could then be screened by panning
techniques.

Another large class of possible extensions to this 
technique is to use a modified version of the vector to 
isolate genes whose products modify peptides, proteins, or RNA 
in a desired fashion. This requires the availability of a



WO 96/40987 

PCT/US96/09809 

receptor that binds specifically to the modified product. For 
the general case, a connection is made between the plasmid and 
the substrate peptide, protein, or RNA, as described above.
The plasmid is then used as a cloning vector to make libraries
of DNA or cDNA from a source with the potential to contain the
desired modification gene (specific organisms, PCR amplified
antibody genes, etc.) under the control of a promoter that
functions in E. coli. Plasmids carrying the gene in question
could then be isolated by panning lysates of the library with 
the receptor specific for the modified product. 

For example, a gene encoding an enzyme that cleaves 
a particular amino acid sequence could be isolated from 
libraries of DNA from organisms that might have such a
protease or from amplified antibody cDNA. An antibody for use 
as the receptor would first be made to the peptide that would 
remain after the desired cleavage reaction had taken place. 
Many such antibodies will not bind to that peptide unless it 
is exposed at the N- or C-terminus of the protein. The coding 
sequence for the uncleaved substrate sequence would be 
attached to the DNA binding protein coding sequence in a 
vector. This vector would be used to make an expression 
library from an appropriate source. Members of this library 
containing a gene that encoded an enzyme able to cleave the 
peptide would cleave only the peptide attached to the plasmid 
with that gene. Panning of lysates of the library would
preferentially isolate those plasmids with active genes. 

Selection of DNA Binding Proteins by Forced Evolution

Although some DNA binding proteins for use in the 
invention are obtained directly from the repertoire of natural 
DNA binding proteins, other DNA binding proteins are selected
by a process termed forced evolution. Forced evolution
selects a DNA binding protein optimal for use in the peptides
on plasmids screening methods described elsewhere in the
specification. The functional properties that allow a DNA
binding protein to survive the forced evolution process are 
the very same properties that confer optimum capacity to 
screen peptides in the peptides on plasmids method. Thus, the 
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forced evolution process does not require prior knowledge of 
the binding mechanism of a DNA binding protein or prior 
definition of criteria (e.g., dissociation half-life,
conformation) on which the efficacy of a DNA binding protein
in the peptides on plasmids method is founded. 

The method for performing forced evolution of a DNA 
binding protein is closely analogous to the methods of 
screening peptides on plasmids. The main difference between 
screening peptides on plasmids and forced evolution lies in 
whether the peptide component or the DNA binding component of 
a fusion protein is varied in different members of a library. 
In the peptides on plasmid method, the DNA binding protein is 
constant and the peptide moiety varies in different members of 
the library. The methods select a peptide with specific 
affinity for a receptor.

In the forced evolution method, the peptide is 
constant between different members of a library, and the DNA 
binding protein varies between members. As in the peptides on 
plasmids method, cells are transformed with libraries of 
vectors encoding fusion proteins. The fusion proteins 
comprise a potential DNA binding protein fused to the constant 
peptide. The cells are cultured under conditions in which the 
fusion proteins are expressed. If a fusion protein comprises 
a potential DNA binding protein that in fact has an affinity 
for the vector encoding it, the fusion protein binds to the 
vector to form a complex. The cells are lysed, releasing the
complexes.

Complexes are screened by affinity purification on a 
receptor known to bind the peptide present in all of the
complexes. Vectors are purified from complexes binding to the
receptor via the peptide, amplified (e.g., by retransformation
or PCR) and sequenced to reveal the identity of DNA binding
proteins that have survived the selection process. To have
survived the selection process, a DNA binding protein must
have two properties: (1) capacity to remain complexed with the
vector encoding it throughout the screening process; and
(2) capacity to display the peptide with a conformation
suitable for interaction with its receptor. These are the
properties that make a DNA binding protein useful for
displaying a peptide in the peptides on plasmids method.

(1) Sources of Potential DNA Binding Proteins

The oligonucleotides encoding the potential DNA
binding proteins can derive from a number of sources. Often
one starts with a natural DNA binding protein, in which case
the different potential DNA binding proteins represent
variants of the natural DNA binding protein. Variants of a
natural DNA binding protein can be produced by PCR mutagenesis
of a DNA sequence encoding the protein. PCR mutagenesis can
result in a low rate of mutagenesis at any position of the
coding sequence. Thus, typically, each potential DNA binding
protein shows a high degree of sequence identity (e.g., at
least 95 or 98% sequence identity) with the natural protein,
but the collective library includes variants at all or nearly
all of the positions in the protein. PCR mutagenesis is
particularly suitable for natural DNA binding proteins which
have not been extensively characterized, and for which there
is little information about which amino acid residues are
critical for binding. For other DNA binding proteins, such as
lacI, for which prior studies have already identified certain
positions as being important for binding, mutagenesis can be
focused on these positions. For example, the coding sequence
of the natural protein can be synthesized on a DNA
synthesizer, but with the introduction of randomized codons at
the critical loci for binding.
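
The focused strategy can be pictured as a simple design step in which chosen codons of a coding sequence are replaced by a degenerate codon before synthesis. The sketch below is illustrative only: the five-codon template, the choice of codon numbers, and the use of NNK as the degenerate codon are assumptions made for the example, not positions or sequences taken from this specification.

```python
def randomize_codons(coding_seq, codon_numbers, degenerate="NNK"):
    """Replace the given 1-based codon numbers with a degenerate codon."""
    if len(coding_seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    codons = [coding_seq[i:i + 3] for i in range(0, len(coding_seq), 3)]
    for number in codon_numbers:
        if not 1 <= number <= len(codons):
            raise IndexError(f"codon {number} is outside the sequence")
        codons[number - 1] = degenerate
    return " ".join(codons)


template = "ATGAAACCAGTAACG"  # hypothetical 5-codon stretch
design = randomize_codons(template, [2, 4])

# Each NNK position contributes 32 possible codons to the synthesized pool.
variants = 32 ** design.split().count("NNK")
print(design, variants)
```

Restricting randomization to a few positions keeps the number of variants small enough to be sampled thoroughly, whereas low-rate PCR mutagenesis spreads a similar number of changes across the whole coding sequence.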

The methods can screen multiple natural DNA binding
proteins, or variants thereof, simultaneously. The methods
can also screen potential binding proteins containing repeated
copies of a natural binding domain or binding domains obtained
from more than one natural protein. The potential DNA binding
proteins can also be variants of a consensus DNA binding
protein sequence or any theoretical DNA sequence thought to
have DNA binding properties. The potential DNA binding 
proteins can also constitute random sequences from an epitope 
library encoding all or a substantial number of all possible 
peptide epitopes of a given length.

Surprisingly, it has been found that the forced
evolutionary method is sufficiently [...]
select variants of a natural [...]
[...] characteristics for use in [...]
screening method. Thus, in general, [...]
[...] a specific DNA sequence known [...]
histones are suitable.

In some applications, however, it is desirable to
evolve a DNA binding protein having specificity for a
[...] for that [...]
[...] can be tailored to [...]
[...] retaining affinity for the specific sequence
and showing improved characteristics relative to the [...]
DNA binding protein. If variants of [...]
[...] are being screened [...]
[...] for each DNA binding protein, [...]
the recognition sequence for that binding protein. Families
of oligonucleotide variants are then produced separately for
[...] binding protein [...] vector encoding
that binding protein and the corresponding recognition
[...]. Selection [...]
variants to a single natural binding protein.

There are a [...] of strategies to drive evolution
toward selection of DNA binding proteins having specificity
for a given sequence. For example, affinity purification
[...] performed in the presence of a large [...] of DNA
lacking the specific binding site present in the vector.
Ideally, this DNA constitutes a derivative of the vector from
which the specific binding site for the DNA [...]
been deleted. However, bulk DNA [...] commercial sources [...].
The presence of DNA lacking the specific binding site in the
screening buffer accelerates dissociation of DNA binding
proteins bound other than at the specific site [...]
[...]
express β-galactosidase, and give rise to blue colonies.
(3) Optimization of [...]

[...] the DNA [...] the fusion [...] used in the above methods,
[...]. Similarly, if the DNA binding protein has more than
one [...]
can be [...] linkers [...]
[...] the forced evolutionary process [...]
[...] with the DNA binding protein.
[...] encoding both linker and DNA
binding protein [...] subjected to PCR mutagenesis.
Alternatively, [...]
[...] and screened before
[...]
likelihood of disrupting the DNA binding protein. However,
other viable sites of insertion are readily selected using the
forced evolution method. For example, a library of vectors is
constructed in which a common peptide is inserted at different
sites in a DNA binding protein under test. The library of
vectors is transformed and propagated in host cells, and the
vectors isolated and panned for binding to the peptide
receptor via the displayed peptide. The vectors binding to
the receptor are those in which the site of insertion resulted
in display of the peptide without disrupting the binding
characteristics of the DNA binding protein.



(5) Successive Rounds of Enrichment
A single round of propagation and affinity 
purification of a library of potential DNA binding proteins 
selects a pool of vectors encoding DNA binding proteins that 
are at least somewhat useful for screening peptides in the 
peptides on plasmids method. Further optimized DNA binding 
proteins are obtained by performing successive rounds of 
enrichment. That is, vectors present in complexes bound to
the receptor in the screening process are isolated, 
retransformed into host cells, and the selection process is 
repeated. Each round of selection results in greater 
enrichment for DNA binding proteins having optimal 
characteristics for use in screening peptides, because the 
vectors encoding these proteins are statistically most likely 
to survive the selection. The stringency of binding and wash 
buffers can be increased in successive rounds of screening as 
is the case when screening peptide libraries. Typically, 
vectors surviving four rounds of affinity selection encode DNA 
binding protein having highly suitable characteristics for 
peptide display. 
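
The effect of repeating the selection can be pictured with a toy calculation. The sketch below is not taken from the specification: it assumes that suitable DNA binding proteins start as a small fraction of the library and that, in each round, their vectors are recovered a fixed number of times more efficiently than the rest. The starting fraction (1e-6) and the per-round enrichment factor (100) are hypothetical values chosen only to show the trend.

```python
def enrich(start_fraction, enrichment_factor, rounds):
    """Fraction of suitable clones after each round of affinity selection."""
    fraction = start_fraction
    history = [fraction]
    for _ in range(rounds):
        # Suitable clones are recovered enrichment_factor times more
        # efficiently than the remaining clones.
        suitable = fraction * enrichment_factor
        other = 1.0 - fraction
        fraction = suitable / (suitable + other)
        history.append(fraction)
    return history


history = enrich(start_fraction=1e-6, enrichment_factor=100, rounds=4)
for round_number, fraction in enumerate(history):
    print(round_number, f"{fraction:.6f}")
```

In this simple model the odds of picking a suitable clone are multiplied by the enrichment factor in every round, so a rare variant can come to dominate the pool within a few rounds. Real recoveries will depend on the stringency of the binding and wash conditions.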

In general, a DNA binding protein surviving the 
evolutionary process is optimized for use in the peptides on 
plasmids method under the same conditions as those employed in 
the evolutionary process. Thus, the same or similar 
conditions should be employed in subsequent use of a DNA
binding protein in the peptides on plasmids method as were
[...]

EXAMPLE 1

[...] hsdR17 [...] Tn10 // recA1 [...]
[...] ΔclpA319::kan [...] hsdR17, Δ(ompT-fepC),
[...] which would reduce the available [...] peptides [...].
[...] proteolysis include [...] ompT, and clpA,P.

[...] pBAD18 as the starting plasmid. Plasmid
pBAD18 contains [...]
gene to permit [...] pBR322 origin and the [...]
[...] oligonucleotides ON-286 and
[...] below:

ON-286 5'-[...]-3' [...]

ON-287 5'-CGT [...]-3' [...]

The amplification [...] to the
manufacturer's instructions, [...] for the use of Vent DNA
polymerase (New England Biolabs). ON-286 contains a
nonhomologous 5' region [...] and a consensus
ribosome binding site (see Gold and Stormo, Methods in
Enzymology (Goeddel, ed. [...]) [...] the
initiation codon of lacI from GTG to ATG. ON-287 changes
codons 356 and 357 of lacI to [...] two silent
[...] adds a [...] site [...] lacI stop codon.
Cloning of the [...] amplification
product into plasmid pBAD18 [...].

[...] by amplifying an unrelated sequence of
325 bp (the human [...] dopamine receptor [...];
see England et al., [...],
87-90 (1991), and U.S. S.N. 07/645,029, filed [...] 1991,
both of which are incorporated herein [...]) with
oligonucleotides ON-295 and
ON-296, shown below:



15 ON-295 



ON-29 5 



- - r s ^ c ™ - AGA A „ CGG TAG 

-CGC CAT CGA TGA ATT TG AGC N0:70, 

- OTG TGA TGA AGA " ^ ~ « ^ «» - 

ON-295 adds an tfd eI site 

a, Plified fragment> an a r 0N a . at one end of 

^ ° th - «*. Cloning of the " / "* 1 Slte - d '«°. 

PJS100 produced pl aslnid J sl0 f 2 WeI t0 ^ agm ent into 

oligonucleotides o N - 3i2 ^/o^™"! ^-"tary 

XhSl f -^nt at the 3- end of , acI ^ ^ t0 

oligonucleotides add <= ln pJS102 - These 

— er (GADGA f^"^ * «~ a.ino acid 

f SEQ ID "0:7], to the end of the ^7"" " ^^^^VT 
introduce an SfH site ln "^-type iael seguence/ 

»™» are shown belw . the Se <^"« -coding the spacer, 
ON-313



5'-CTA GAT TAG GTT ACA ACT TTG AAC TGA CGA CGC AGG 
AAA CCA CCG TAG GCC CCG TCG GCC CCC TGC CCG 
CTC-3' [SEQ ID NO:73] 

The library plasmid pMC5 was constructed by cloning
complementary oligonucleotides ON-335 and ON-336 to replace
the SfiI to HindIII dynorphin B segment of pMC3, as shown in
Fig. 2. Oligonucleotides ON-335 and ON-336 are shown below:

ON-335 5'-GGG CCT AAT TAA TTA-3' [SEQ ID NO:74]

ON-336 5'-AGC TTA ATT AAT TAG GCC CCG T-3' [SEQ ID NO:75]

Plasmid pMC3 is available in strain ARI161 from the American
Type Culture Collection under the accession number ATCC No.
63318.



EXAMPLE 2 

Construction of a Random Dodecamer Peptide Library
Oligonucleotide ON-332 was synthesized with the
sequence:

5'-GT GGC GCC (NNK)12 TAA GGT CTC G-3' [SEQ ID NO:76]

where N is A, C, G, or T (equimolar) and K is G or T (see
Cwirla et al., supra). The oligonucleotide was purified by
HPLC and phosphorylated with T4 kinase (New England Biolabs).
The two half-site oligonucleotides ON-369 and ON-370 were
phosphorylated during synthesis and are shown below:

ON-369 5'-GGC GCC ACC GT-3' [SEQ ID NO:77]

ON-370 5'-AGC TCG AGA CCT TA-3' [SEQ ID NO:78]

ON-369 and ON-370 annealed to ON-332 produce SfiI- and
HindIII-compatible ends, respectively, but the ligated product
does not have either recognition sequence (see Fig. 2).
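
The NNK scheme used for ON-332 can be illustrated with a short sketch. It enumerates the NNK codons against the standard genetic code and estimates how often a twelve-codon insert would be free of stop codons, assuming equimolar and independent base incorporation. The variable names are illustrative and do not come from the specification.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# NNK: any base at positions 1 and 2, G or T at position 3.
nnk_codons = ["".join(c) for c in product("ACGT", "ACGT", "GT")]
stops = [c for c in nnk_codons if CODON_TABLE[c] == "*"]
encoded = {CODON_TABLE[c] for c in nnk_codons} - {"*"}

# Chance that 12 independent, equimolar NNK codons contain no stop codon
# (an assumption; real synthesis may deviate from equimolar incorporation).
p_full_length = (1 - len(stops) / len(nnk_codons)) ** 12

print(len(nnk_codons), stops, len(encoded), round(p_full_length, 3))
```

Restricting the third position to G or T keeps every amino acid available while leaving a single stop codon in the mix.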

Four hundred pmoles of each oligonucleotide were
annealed in a 25 reaction buffer (10 mM Tris, pH 7.4, 1 mM
EDTA, 100 mM NaCl), by heating to 65°C for 10 min. and cooling
for 30 min. to room temperature. Vector pMC5 was digested to
completion with SfiI and HindIII, and the vector backbone was
isolated by 4 rounds of washing with TE buffer (10 mM Tris,
pH 8.0, 1 mM EDTA) in a Centricon 100 microconcentrator
(Amicon) by the manufacturer's instructions, followed by
phenol extraction and ethanol precipitation. The annealed
oligonucleotides were added to 64 micrograms of digested pMC5
at a 4:1 molar ratio in a 3.2 ml ligation reaction containing
5% PEG, 3200 units of HindIII, 194 Weiss units of T4 ligase
(New England Biolabs), 1 mM ATP, 20 mM Tris, pH 7.5, 10 mM
MgCl2, 0.1 mM EDTA, 50 µg/ml BSA, and 2 mM DTT. The reaction
was split equally into 3 tubes and incubated overnight at
15°C.

After ethanol precipitation, 1/16 of the ligated DNA
(4 µg) was introduced into MC1061 (80 µl) by electroporation
(Dower et al., Nucl. Acids Res. 16, 6127-6145 (1988),
incorporated herein by reference), to yield 5.5 x 10*
independent transformants. The library was amplified
approximately 1000-fold in 1 liter of LB/100 µg/ml ampicillin
by growth of the transformants at 37°C to an A600 of 1. The
cells containing the library were concentrated by
centrifugation at 5500 x g for 6 min., washed once in ice-cold
50 mM Tris (pH 7.6), 10 mM EDTA, 100 mM KCl, followed by a
wash in ice-cold 10 mM Tris, 0.1 mM EDTA, 100 mM KCl. The
final pellet was resuspended in 16 ml of HEG buffer (35 mM
HEPES/KOH pH 7.5, 0.1 mM EDTA, 100 mM Na glutamate),
distributed into 19 tubes of 1.0 ml each, frozen on dry ice,
and stored at -70°C.



EXAMPLE 3 
Panning the Library 
One aliquot (1.0 ml) of the library prepared in
Example 2 was thawed on ice and added to 9 ml of lysis buffer
(35 mM HEPES (pH 7.5 with KOH), 0.1 mM EDTA, 100 mM Na
glutamate, 5% glycerol, 0.3 mg/ml BSA, 1 mM DTT, and 0.1 mM
PMSF). Lysozyme was added (0.3 ml at 10 mg/ml in HEG), and
the mixture was incubated on ice for 1 hr.
The cellular debris was removed by centrifugation of
the lysate at 20,000 x g for 15 min., and the supernatant was
concentrated by centrifugation in a Centriprep® 100
concentrator (Amicon) at 500 x g for 40 min. The concentrated
supernatant (about 0.5 ml) was washed with 10 ml of HEG buffer
and centrifuged as before. A sample (5%) of the total lysate
was removed to determine the pre-panned input of plasmid
complexes.

An alternate method for partially purifying and
concentrating the lysate is as follows. About 2.0 ml of the
frozen cells in HEG are thawed on ice, and then 8 ml of lysis
buffer without Na glutamate (high ionic strength inhibits
lysozyme; DTT is optional) are added to the cells, and the
mixture is incubated on ice for 1 hr. The cellular debris is
removed from the lysate by centrifugation at 20,000 x g for
15 min., and the supernatant is loaded onto a Sephacryl® S-400
High Resolution (Pharmacia) gel-filtration column (22 mm x
250 mm). The plasmid-fusion protein complexes elute in the
void volume. The void volume (30 ml) is concentrated with two
Centriprep® 100 concentrators, as described above. After
adjusting the Na glutamate concentration of the concentrate,
one carries out the remainder of the procedure in the same
manner as with the first method.

Half of the remaining concentrated lysate was added
to D32.39-antibody-coated sheep-anti-mouse (Fc)-coupled
magnetic beads (10 µg of D32.39 added to 5 mg Dynal beads for
1 hr. at 25°C followed by 6 washes with HEG), and half was
added to uncoated beads. After incubating the lysates with
the beads at 0°C for 1 hr. with shaking, the beads were washed
three times with 5 ml of cold HEG/0.1% BSA and then three
times with HEG using a MACS 0.6 tesla magnet (Miltenyi Biotec
GmbH) to immobilize the beads. The plasmids were dissociated
from the beads by phenol extraction, and after adding 20 µg of
glycogen (Boehringer Mannheim), the DNA was precipitated with
an equal volume of isopropanol. The pellet was washed with
75% ethanol, and the DNA was resuspended in either 4 µl
(panned DNA) or 400 µl (pre-panned DNA) of H2O. Strain MC1061
was transformed using 2 µl each of the DNA solutions to permit
counts of recovered plasmids and amplification of the selected
plasmids. The results of the panning are shown below in
Table 1.



TABLE 1

Number of Transformants

Panning                   Ab D32.39      Uncoated
Round      Input          Beads          Beads

1          1.6 x 10^10    9 x 10^7       1.7 x 10^5
2          1.4 x 10^11    6.1 x 10^7     1.2 x 10^4
3          1.7 x 10^11    2.0 x 10^9     40
4          1.6 x 10^11    4 x 10^4
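
The recovery figures in Table 1 can be summarized as two ratios per round: the fraction of input plasmids recovered on antibody-coated beads, and the ratio of antibody-bead to uncoated-bead recovery. The sketch below is illustrative only. It uses rounds 2 and 3, and it assumes the exponents in the scanned table read as 10^11, 10^7, 10^4, and 10^9; the function and dictionary names are not from the specification.

```python
# Transformant counts as read from Table 1 (rounds 2 and 3 only).
# The exponents are an assumption about how the scanned table should be read.
table1 = {
    2: {"input": 1.4e11, "antibody": 6.1e7, "uncoated": 1.2e4},
    3: {"input": 1.7e11, "antibody": 2.0e9, "uncoated": 40.0},
}

def summarize(row):
    """Return (fraction of input recovered on antibody beads,
    ratio of antibody-bead to uncoated-bead recovery)."""
    recovered = row["antibody"] / row["input"]
    specificity = row["antibody"] / row["uncoated"]
    return recovered, specificity

for rnd, row in sorted(table1.items()):
    recovered, specificity = summarize(row)
    print(f"round {rnd}: recovered {recovered:.2e} of input, "
          f"antibody/uncoated = {specificity:.2e}")
```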
EXAMPLE 4
ELISA Analysis of the Library
An ELISA was used to test MC1061 transformants from
the second, third, and fourth rounds for D32.39-specific
ligands (see Example 3). The ELISA was performed in a 96-well
plate (Beckman). Single colonies of transformants obtained
from panning were grown overnight in LB/100 µg/ml ampicillin
at 37°C. The overnight cultures were diluted 1/10 in 3 ml
LB/100 µg/ml ampicillin and grown 1 hr. The expression of the
lac repressor-peptide fusions was induced by the addition of
arabinose to a final concentration of 0.2%.

The cells were lysed as described above in 1 ml of
lysis buffer plus lysozyme and stored at -70°C. Thawed crude
lysate was added to each of 2 wells (100 µl/well), and the
plate was incubated at 37°C. After 45 min, 100 µl of 1% BSA
in PBS (10 mM NaPO4, pH 7.4, 120 mM NaCl, and 2.7 mM KCl) were
added for an additional 15 min. at 37°C, followed by 3 washes
with PBS/0.05% Tween 20. Each well then was blocked with 1%
BSA in PBS (200 µl/well) for 30 min. at 37°C, and the wells
were washed as before.

The primary antibody, D32.39 (100 µl of antibody at
1 µg/ml in PBS/0.1% BSA), was added to each well, the plate was
incubated at room temperature for 1 hr., and then each well
was washed as before. The secondary antibody, alkaline
phosphatase-conjugated goat-anti-mouse antibody (Gibco-BRL),
was diluted 1/3000 in PBS/0.1% BSA and added to each well
(100 µl/well); the plate was then incubated for 1 hr. at room
temperature. After three washes with PBS/0.05% Tween 20 and
two with TBS (10 mM Tris pH 7.5, 150 mM NaCl), the ELISA was
developed with 4 mg/ml p-nitrophenyl phosphate in 1 M
diethanolamine/HCl pH 9.8, 0.24 mM MgCl2 (200 µl/well).

WO 96/40987
PCT/US96/09809

The reaction was stopped after 6 min. by the
addition of 2 M NaOH (50 µl/well), and the absorbance at
405 nm was measured on a plate reader (a Biomek, from
Beckman). The positive control for the ELISA was MC1061
transformed with pMC3, encoding the lac repressor-dynorphin B
fusion. The negative controls were wells not coated with
lysate. Background variability was calculated from the wells
containing lysates from 16 colonies selected at random from
the library, none of which scored significantly above the
negative controls. Wells were scored as positive if the
measured absorbance was at least two standard deviations above
background.
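
The scoring rule just described can be sketched as follows. This is a minimal illustration that assumes "background" means the mean absorbance of the random-library wells and that the sample standard deviation is used; the specification does not state either detail. The readings and the function name are hypothetical.

```python
from statistics import mean, stdev

def score_wells(background, samples, n_sd=2.0):
    """Flag sample wells whose absorbance is at least n_sd standard
    deviations above the mean of the background wells."""
    cutoff = mean(background) + n_sd * stdev(background)
    return cutoff, {name: a >= cutoff for name, a in samples.items()}

# Hypothetical A405 readings, for illustration only.
background = [0.08, 0.10, 0.09, 0.11, 0.10, 0.09]
samples = {"clone_1": 0.55, "clone_2": 0.10}
cutoff, calls = score_wells(background, samples)
print(round(cutoff, 3), calls)
```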

Of randomly picked colonies, 35 of 58 (60%) tested
positive by ELISA: 11 of 20 from round two, 12 of 16 from
round three, and 12 of 22 from round four. None of 16 random
colonies from the unpanned library scored significantly above
background. These data demonstrate the rapid enrichment of
specific ligands achieved by the present invention: after
only two rounds of panning, the majority of plasmids encoded
peptides with affinity for the D32.39 antibody.
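
The per-round proportions behind the 35 of 58 figure can be checked with a few lines of arithmetic. The counts are taken from the paragraph above; the names are illustrative.

```python
# (positives, colonies tested) per panning round, from the text above.
elisa = {2: (11, 20), 3: (12, 16), 4: (12, 22)}

def percent(pos, total):
    """Percentage of tested colonies that scored positive."""
    return 100.0 * pos / total

for rnd, (pos, total) in sorted(elisa.items()):
    print(f"round {rnd}: {pos}/{total} = {percent(pos, total):.0f}%")

total_pos = sum(p for p, _ in elisa.values())
total_n = sum(n for _, n in elisa.values())
print(f"overall: {total_pos}/{total_n} = {percent(total_pos, total_n):.0f}%")
```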

To determine the structure of the peptide ligands
obtained by the present method, plasmids from both ELISA
positive and ELISA negative colonies obtained after panning
were sequenced. Double stranded plasmid DNA, isolated from
strain XL1-Blue, was sequenced using Sequenase® (US
Biochemicals) according to the instructions supplied by the
manufacturer.

The translated peptide sequence for all ELISA
positive colonies examined shared the consensus sequence shown
in Fig. 3. The preferred recognition sequence for the D32.39
antibody apparently covers a six amino acid region of the
dynorphin B peptide (RQFKVV). In the first position, arginine
is invariant for all of the ELISA positive clones. No strong
bias was evident for residues in the second position. In the
third position, however, five amino acids (phenylalanine,
histidine, asparagine, tyrosine, and tryptophan, in order of
frequency) account for 98% of the residues. Of these, the
aromatic amino acids comprise 74% of this total. The fourth
position shows a strong bias for the positively charged
residues lysine (69%) and arginine (21%). The fifth position
is occupied almost exclusively by hydrophobic residues, most
of which are valine (81%). Valine and threonine predominate
in the sixth position (76%), with serine and isoleucine
accounting for most of the remaining amino acids.
Of the ELISA negative clones obtained after panning,
greater than half showed peptide sequence similarity to the
consensus motif (Fig. 3). None of 19 isolates sequenced from
the unpanned library showed any such similarity. Some of
these ELISA negative sequences differ enough from the
consensus that their affinity for the antibody may be
insufficient to permit detection in the ELISA. There are,
however, ELISA negative sequences identical in the five
conserved amino acids of the consensus region to clones that
scored positive (e.g., #23 and #57). There may be amino acids
outside the consensus region that affect binding of the
peptide to antibody, its susceptibility to E. coli
proteases, or its availability in the ELISA. That even the
ELISA negative clones frequently have an obvious consensus
sequence demonstrates the utility of the present invention for
isolating ligands for biological receptors.

EXAMPLE 5

1. Optimization of Linkers for Headpiece Dimer Display

To obtain headpiece dimer polypeptides that bind to
their encoding plasmids with sufficient stability to
facilitate affinity purification, two headpiece domains were
inserted in a construct adjoined by random linkers (Figs. 5 &
6a). The vector (pDIMER1) contained two repeated segments of
the lac repressor gene, respectively encoding residues 1-49
and 2-49 of the headpiece DNA binding domain. These two
segments were linked by a sequence encoding a 4-5 random
residue "headpiece linker" which, based on molecular modeling,
might allow positioning of the headpiece DNA binding domains
for stable binding to lacOs sites present on the parent
plasmid. Fused to the second headpiece domain was a sequence
encoding a 4 random residue "display linker" designed to
facilitate the C-terminal display of peptide ligands. To
screen the initial "linker library", a 7 residue epitope
(RQFKVVT) for the D32.39 monoclonal antibody (Barrett &
Goldstein, Neuropeptides 6, 113-120 (1985)) was fused to the
C-terminal display linker. To increase the chance of finding
active headpiece dimers, the vector was designed to have two
lacOs sites.

Headpiece dimer "linker" library plasmid pDIMER1 was
constructed as follows. With 10 ng pMC5 (encoding the lacI
headpiece domain) as a template, primers ON-929
(TATTTGCACGGCGTCACACTT [SEQ ID NO:79]) and ON-930
(CCGCGCCTGGGCCCAGGGAATGTAATTGAGCTC-
CGCCATCGCCGCTT [SEQ ID NO:80]) were used in a (25 cycle) PCR
reaction to modify the ends of the region encoding the first
49 residues of lacI to form headpiece #1. About 1 of the
modified fragment encoding headpiece #1 was digested with
BamHI and ApaI, gel purified, and inserted between the BamHI
and ApaI sites of pMC5, replacing the lacI coding region, to
form intermediate plasmid pMC5dlacI. To construct "headpiece
#2", PCR primers ON-938 (CGATGGCGGAGCTCAATTACATTCCC-(NNK)5-
AAACCAGTAACGTTATACGAT [SEQ ID NO:81]), ON-939 (CGATGGCGGAGCTC-
AATTACATTCCC-(NNK)4-AAACCAGTAACGTTATACGAT [SEQ ID NO:82]), and
ON-940 (CGCCCGCCAAGCTTAGGTTACAACTTTGAACTGACG-(MNN)4-GGGAATGTA-
ATTCAGCTCCGCCAT [SEQ ID NO:83]) were used to attach sequences
encoding a 4 or 5 random residue "headpiece" linker, a 4
random residue "display" linker, and the D32.39 monoclonal
antibody epitope (RQFKVVT [SEQ ID NO:66]) (Barrett &
Goldstein, Neuropeptides 6, 113-120 (1985); Cull et al., Proc.
Natl. Acad. Sci. USA 89, 1865-1869 (1992)) to codons 2 through
49 of the lacI headpiece. Approximately 1 of the end-modified
DNA fragment encoding headpiece #2 was digested with
SstI and HindIII, gel purified, and ligated into the SstI and
HindIII sites of pMC5dlacI. Plasmids encoding
four-random-residue headpiece linkers were combined with those
encoding five-random-residue linkers at a ratio of
approximately 10:1 to make the pDIMER1 linker library. The
pDIMER1 library was introduced into bacterial strain ARI 230 by
electroporation to produce a library of 3 x 10 s individual
transformants.
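
ON-940 carries (MNN)4 where ON-938 and ON-939 carry (NNK) repeats. A short sketch shows why the two notations describe the same random codons, assuming ON-940 is written as the opposite-strand primer (its sequence contains the reverse complement of the epitope codons, which is consistent with that reading). The helper name and lookup table are illustrative.

```python
# IUPAC complements for the symbols used here (K = G/T, M = A/C).
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G",
              "N": "N", "K": "M", "M": "K"}

def reverse_complement(seq):
    """Reverse complement of a sequence that may contain N, K, or M."""
    return "".join(COMPLEMENT[b] for b in reversed(seq))

# One degenerate codon, then the four-codon display linker.
print(reverse_complement("MNN"))
print(reverse_complement("MNN" * 4))

# Epitope-coding stretch of ON-940, read back on the coding strand.
print(reverse_complement("GGTTACAACTTTGAACTGACG"))
```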

The headpiece dimer gene was expressed under the
control of the araB promoter using three separate induction
levels. The linker library was amplified in three 325 ml
pools of LB/Amp100 medium (100 µg/ml ampicillin): LB
with no additives for basal ("A") promoter induction; LB with
0.1% glucose followed by promoter induction with 0.2%
L-arabinose for 30 min prior to harvest to give partial ("B")
induction levels; and LB with 0.2% L-arabinose for 15 min
prior to harvest for full ("C") promoter induction.

Upon cell lysis, the subset of these plasmids that
displayed the D32.39 epitope was enriched relative to other
plasmids in the population. Stable complexes were captured by
panning the lysate in microtiter wells coated with immobilized
D32.39 antibody. Panning was carried out in Immulon 4
microtiter wells (Dynatech Laboratories) coated with 2 µg per
well D32.39 antibody as described in Example 6, except that
HEK/1% BSA (35 mM HEPES (Research Organics Inc.), pH 7.5 with
KOH, 0.1 mM EDTA, 50 mM KCl, 1% Bovine Serum Albumin, Fraction
V) replaced HEKL/BSA as the primary incubation and wash
buffer. In all rounds, 0.1 to 0.2 mg/ml sonicated herring DNA
was included in the incubation buffer as a nonspecific DNA
competitor. In rounds three and four of panning, 5 to
10 µg/ml of self-annealed ON-413, a lacOs-containing
oligonucleotide (GAA TTC AAT TGT GAG CGC TCA CAA TTG AAT TC
[SEQ ID NO:84]), was included in the incubation buffer as a
competitor. Following a one hour incubation at 4°C, unbound
headpiece dimer complexes were washed from the wells four
times with HEK/BSA followed by two washes with HEK. Bound
plasmids were extracted from the wells using 50 µl/well
TE/NaCl buffer (10 mM Tris-HCl (pH 7.5) / 1 mM EDTA / 0.5 M




NaCl) mixed with 50 µl/well phenol. After addition of 1 µl
glycogen carrier (20 mg/ml, Boehringer Mannheim), the
recovered plasmids were precipitated with an equal volume of
isopropanol, followed by a 10 minute spin at 14,000 rpm in a
microfuge. Plasmids were resuspended in 4 µl water and used
to transform bacterial strain ARI 230 for recovery counts and
further rounds of panning. After two rounds of panning,
enrichment numbers indicated that the pools grown under
conditions of "B" (partial) and "C" (full) promoter
induction gave the best enrichment. Based on this finding,
only the B and C pools were used in rounds 3 and 4.
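
The competitor oligonucleotide ON-413 used in rounds three and four is described as self-annealed, which requires the sequence to equal its own reverse complement. The check below is a small illustrative sketch using the sequence as printed in the text; the helper name is not from the specification.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return "".join(COMPLEMENT[b] for b in reversed(seq))

# ON-413 as printed in the text (spaces removed).
on_413 = "GAATTCAATTGTGAGCGCTCACAATTGAATTC"
is_self_complementary = on_413 == reverse_complement(on_413)
print(len(on_413), is_self_complementary)
```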

Sequencing of individual clones selected after four
rounds of panning revealed the primary structure of their
linkers. Of 22 clones that yielded readable sequence, 5
contained frameshifts or stop codons which would prevent
translation of the D32.39 epitope. Two B pool clones,
isolates B7 and B10, were present as duplicates, indicating
selective enrichment by the panning procedure from less than
one in 10 s to more than one in six. Surprisingly, one of the
enriched clones, isolate B10, and one C pool clone, C5, had
frameshifts early in the second headpiece domain with a second
frameshift late in the headpiece coding sequence that restored
the reading frame of the D32.39 epitope (Fig. 6b).

To assess which clones encoded the most stable DNA
binding proteins while displaying the epitope in the most
favorable way, the clones were individually evaluated in a
panning experiment. Each clone having the D32.39 epitope in
the correct frame was examined together with one clone having
a frameshifted epitope as a negative control. An intact lacI
construct (see Example 3) served as a positive control. Each
clone was panned against D32.39 Ab and also against MAb344 as
a negative control. Specific enrichment was evaluated by
transformation of E. coli with recovered plasmids.

After four rounds of panning, individual clones were
grown in LB/Amp100/0.1% glucose for two hours at 37°C.
Following addition of L-arabinose to 0.2% (B induction),
cultures were grown for an additional 30 min, then chilled on
ice for harvest. 1 ml of each culture was microfuged for
2 min at 14,000 rpm. The cells were washed with 0.5 ml ice
cold WTEK buffer (50 mM Tris (pH 7.5), 10 mM EDTA, 100 mM
KCl), centrifuged for 2 min, washed with 0.25 ml cold TEK
buffer (10 mM Tris (pH 7.5) / 0.1 mM EDTA / 100 mM KCl),
centrifuged, then resuspended in 100 µl cold HEK buffer. To
each resuspended cell culture, 0.9 ml lysis buffer (35 mM
HEPES (pH 7.5 with KOH), 0.1 mM EDTA, 5% glycerol, 1 mM DTT,
0.1 mM PMSF (phenylmethylsulfonyl fluoride), 0.1 mg/ml BSA)
was added. Cells were lysed by adding 20 µl of 10 mg/ml
lysozyme (Boehringer Mannheim) to each tube followed by
incubation on ice for 1 hr. The lysed cell cultures were then
microfuged at 14,000 rpm for 10 min at 4°C, and the
supernatant transferred to a new tube.

To evaluate each clone, 10 µl of clear lysate was
added to methacrylate beads (Affi-prep 10 support, Bio Rad)
coated with the D32.39 monoclonal antibody, or negative
control MAb344, suspended in 0.5 ml HEK/BSA/0.01 mg/ml herring
DNA. After incubation at 4°C on a tube rotator for one hour,
beads were washed twice with cold HEK/BSA and twice with HEK
over a 50 min period. The remaining antibody-bound plasmid
complexes were recovered from the beads by phenol extraction
and isopropanol precipitation. Enrichment was defined as the
number of transforming units of plasmid recovered panning
against the D32.39 antibody beads divided by the number
recovered panning against the MAb344 control antibody.
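
The enrichment definition above reduces to a single ratio. The sketch below is illustrative: the function name and the counts are hypothetical, and only the ratio itself comes from the text.

```python
def enrichment(tu_target, tu_control):
    """Transforming units recovered on the target antibody divided by
    those recovered on the control antibody."""
    if tu_control <= 0:
        raise ValueError("control recovery must be positive")
    return tu_target / tu_control

# Hypothetical transformant counts, for illustration only.
print(enrichment(tu_target=5600, tu_control=200))
```

A value near 1 means the clone was recovered no better on D32.39 than on the control antibody.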

The individual evaluations revealed relatively few
clones that yielded greater recovery with D32.39 Ab compared
to the negative control. Only four isolates (B7, B10, C4, C5
in Fig. 6) showed enrichment greater than two fold. The best
clones were B7 and B10, the same isolates that represented a
large fraction of the round four population. These isolates
yielded enrichment of 8 and 28 fold, respectively. Of the four
clones showing specific enrichment, three contained cysteine
residues in their headpiece spanning linkers and all four had
a proline residue in their display linkers, suggesting that
some degree of activity might be conferred by these residues.
Surprisingly good enrichment was achieved by isolates B10 and
C5, which contain reading frame shifts in the region encoding
the second headpiece resulting in an entirely different amino
acid sequence from that encoded in the first headpiece domain
(Fig. 6). Overall, the headpiece dimer clones performed less
well than the lacI system described in Example 3.

2. Isolation of Mutant Headpiece Dimers
To increase headpiece dimer/DNA complex stability
and thereby increase the panning performance of individual
clones, random mutations were introduced in the regions
encoding the headpiece dimer and adjoining linkers. Some
mutations in lacI resulting in tighter-binding mutants have
been reported (Betz & Sadler, J. Mol. Biol. 105, 293-319
(1976); Kleina & Miller, J. Mol. Biol. 212, 295-318 (1990);
Kolkhof, Nucl. Acids Res. 20, 5035-5039 (1992); Maurizot &
Grebert, FEBS Lett. 239(1), 105-108 (1988); Miller, The Operon
(Miller & Reznikoff, eds.), pp. 31-88 (1980), Cold Spring
Harbor Laboratory, Cold Spring Harbor, NY).

The starting population for mutagenesis was the
headpiece dimer B pool, obtained after four rounds of random
linker library panning. Starting with pDIMER1 B pool
plasmids, isolated after four rounds of affinity purification,
as a template, flanking primers were used in an adaptation of
mutagenic PCR (Gram et al., Proc. Natl. Acad. Sci. USA 89,
3576-3580 (1992); Leung et al., Technique 1, 11-15 (1989)) to
generate mutations within the headpiece dimer and linker
coding sequence. Approximately 2 of mutated DNA fragments
generated by PCR was digested using NheI and HindIII and
cloned into plasmid pJS145 (a single lacOs-containing vector).
The ligation was amplified in ARI 280 using A and B induction
conditions, to lower the total amount of headpiece dimer
protein in the cells, as described above, to produce a library
of 1.6 x 10 s individual transformants. Stable headpiece
dimer/plasmid complexes were selected from this population by
panning for four rounds on methacrylate beads (Affi-prep 10
support, Bio Rad) coated with the D32.39 monoclonal antibody.
Elution from beads was carried out for one hour at 4°C using
100 µl/well HEK buffer containing 2.8 µM synthetic peptide
RBO-ll (RQFKVVT [SEQ ID NO:66]). Complexes were eluted with
peptide containing the RQFKVVT [SEQ ID NO:66] epitope in all
four rounds.

To determine whether any specific mutations had been
selected through four rounds of panning and amplification, the
regions encoding the headpiece DNA binding domain and
adjoining linkers were sequenced from individual clones. Six
of eight A pool clones and one B pool clone had a specific Q
to R mutation at position 18 within one or both headpiece
domains (Fig. 6). Significantly, this Q18 to R mutation falls
in a portion of the headpiece DNA binding helix that is
critical for the recognition of operator DNA by lacI (Boelens
et al., J. Mol. Biol. 193, 213-216 (1987); Chuprina et al.,
1993; Ebright, Proc. Natl. Acad. Sci. USA 83, 303-307 (1986);
Lamerichs et al., Biochemistry 28, 2985-2991 (1989); Lehming
et al., EMBO J. 9, 615-621 (1990); Lehming et al., EMBO J. 6,
3145-3153 (1987); Lehming et al., Proc. Natl. Acad. Sci. USA
85, 7947-7951 (1988); Sartorius et al., EMBO J. 8, 1265-1270
(1989)). Many other mutations were present throughout the
headpiece dimer and linker coding regions.

To evaluate individual mutant headpiece dimer
clones, single clones were analyzed for enrichment under the
conditions described above. Non-mutant headpiece dimer B7, a
lacI clone, and an out-of-frame headpiece dimer clone were
used as controls. The headpiece linkers chosen to display the
mutant headpiece clone had each been identified in more than
one clonal isolate in the linker optimization screening. The
promoter induction conditions for these tests were A and B,
reflecting the conditions used for the initial selection of
the mutants.
TABLE 2

                                      Enrichment
Isolate    Description          Exp. #1    Exp. #2    Exp. #3

ARI 192    wt. lacI (+CT)                             155
B2.2       Frameshift (N.C.)    0.44       0.77       0.44
B7         Wt. HpD.             0.64       1.24       10.0
A4.2       mutant HpD.          6.2        3.3        61.0
A4.5       mutant HpD.          360        377        1017
A4.7       mutant HpD.          11.6
A4.8       mutant HpD.          95.0       11.6       70.0
B4.5       mutant HpD.          365        116        850
B4.7       mutant HpD.          383        3.5        18.0
B4.8       mutant HpD.                     14.6       57.0

Experiments #1 and #2 were carried out using basal (A)
promoter induction conditions; experiment #3 was carried out
using partial (B) promoter induction.

Table 2 shows that greater enrichment was obtained from
all of the selected mutants than from the wild-type headpiece
dimer B7. Of the nine mutants tested, two isolates from separate
pools, numbered A4.5 and B4.5, showed the greatest enrichment
under both basal and partial promoter induction conditions.
These two clones share the same Q18 to R mutation in their
second headpiece DNA recognition helices, suggesting that this
mutation might be important for DNA binding. These clones
also contain the GRCR headpiece linker found in the B7
isolate, although mutant A4.5 contains a display linker that
is different from the one shared by B7 and mutant B4.5
(Fig. 6).

Expression levels of several mutant headpiece dimer
proteins were analyzed in whole cell lysates on SDS/PAGE to
determine whether increased enrichment was due to increased
expression levels. Staining of proteins from cells grown
under conditions of A or B induction showed little difference
in 14.5 kD headpiece dimer polypeptide expression between
mutants A4.5 and B4.5 as compared to the non-mutant B7.
Western blot analysis of these clones using the D32.39
antibody showed similar levels of expression, indicating that
levels of enrichment for individual mutants were probably due
to structure and not expression levels.

3. Screening a Random Library Using Optimized
Headpieces and Linkers

The library was constructed in plasmid pCMG14, which
contains headpiece dimer mutant A4.5 under the control of the
araB promoter. A series of restriction sites at the 3' end of
the gene facilitates cloning of synthetic oligonucleotides,
allowing fusion of the headpiece dimer display linker to a
random peptide. Each member of the random library consists of
a peptide-displaying headpiece dimer bound to its encoding
plasmid.

A random dodecamer library comprising 10^9
oligonucleotide members was inserted into pCMG14. As a
control, a lacI-based peptides-on-plasmids library of similar
size using the same random library oligonucleotides was
constructed in parallel. Identical bacterial strains, panning
conditions, and basal promoter induction were used for both
libraries. Libraries were panned in microtiter wells coated
with D32.39 antibody or the same amount of MAb344, as a
negative control. Recovery of plasmids during panning yielded
enrichment for both libraries. By round 4, the headpiece
dimer library showed 1855 fold enrichment over the negative
control while the lacI library yielded 1150 fold enrichment.
Fig. 8 shows that isolates picked from both libraries encoded
peptide structures similar to the D32.39 antibody epitope
(RQFKVVT [SEQ ID NO:66]). The enrichment and sequencing
results show that the headpiece dimer system selects peptide
sequences that bind specifically to a receptor.

To verify receptor specificity and to determine the
relative affinity of the headpiece dimer versus the
lacI-library derived peptide sequences obtained through panning,
the peptide-encoding sequences from each fourth round library
pool were transferred into a vector such that the peptides
would be fused in frame with the maltose binding protein (MBP)
(Bedouelle & Duplay, Eur. J. Biochem. 171, 541-549 (1988);
Duplay et al., J. Biol. Chem. 259, 10606-10613 (1984); Guan
et al., Gene 67, 21-30 (1988); Maina et al., Gene 74, 365-373
(1988)). This transfer permitted comparison of the headpiece
dimer and lacI derived peptides fused in an identical fashion
to the same carrier protein.

Under the conditions employed, MBP exists primarily
as a monomer (Blondel & Bedouelle, Prot. Engineer 4, 457-461
(1991); Kellerman & Ferenci, Meth. Enzymol. 90, 459-463
(1982); Richarme, Biochem. Biophys. Res. Comm. 105, 476-481
(1982); Richarme, Biochem. Biophys. Acta 748, 99-108 (1983))
and thus the MBP-peptide fusions would be expected to bind to
receptor monovalently. This non-cooperative interaction
should allow a good correlation between affinity of the
peptide for the receptor and the level of receptor occupancy
during binding and washing steps. The intensity of the ELISA
signal is expected to correlate approximately with peptide
affinity. Evidence supporting this hypothesis was obtained by
comparing the ELISA signal strength produced by MBP fused to
different epitopes of known affinity. Fig. 9 demonstrates
that MBP fused to epitopes with affinities of 340 nM (pCMG33)
and 0.51 nM (pCMG39) produced dramatically different ELISA
signals. Using other peptide ligand families, the intensity
of the signal in the MBP ELISA correlates approximately with
the affinity of the ligand for a receptor.

Lysates of 23 randomly picked isolates from each
library pool were tested in ELISAs with the D32.39 antibody
test receptor, or MAb344 and BSA as controls. Clones without
the correct insert DNA structure determined by sequencing were
excluded from subsequent analysis. Fig. 9 shows that of 19
clones from the headpiece dimer library pool included in the
analysis, 13 showed MBP ELISA signals greater than 0.5. Of 21
isolates from the lacI library, however, only 2 yielded
signals of 0.5 or greater. A comparison of the two data sets
using an unpaired t test showed the difference was significant
(P<0.0001). This comparison indicates that the headpiece
dimer system can selectively enrich ligands with higher
affinity than those obtained by the multivalent lacI-based
peptides-on-plasmids system.
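
For readers unfamiliar with the comparison, an unpaired t statistic can be sketched as below. This is an illustration only: it assumes the equal-variance (Student) form, which the text does not specify, and the ELISA values are hypothetical placeholders rather than the data of Fig. 9. The function name is illustrative.

```python
from math import sqrt
from statistics import mean, variance

def unpaired_t(a, b):
    """Student's unpaired t statistic (pooled variance) and degrees of freedom."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, na + nb - 2

# Hypothetical ELISA signals, for illustration only (not the Fig. 9 data).
headpiece_dimer = [0.9, 1.1, 0.7, 1.3, 0.4]
lacI_library = [0.2, 0.1, 0.3, 0.6, 0.2]
t, df = unpaired_t(headpiece_dimer, lacI_library)
print(round(t, 2), df)
```

The P value would then be read from a t distribution with the returned degrees of freedom.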

4. Headpiece Dimer DNA Binding Studies
The selection system by which headpiece dimers were
selected demanded two things. First, that the protein bind to
the plasmid that encoded it with acceptably high stability.
Second, that the plasmid-protein complex display a peptide in
such a way that it was available for binding to an immobilized
receptor. Many mutations were present in the pool of selected
headpiece dimers, including the Q18 to R mutation, at a
position known to be critical for lacOs sequence recognition
in lacI (Lehming et al., EMBO J. 9, 615-621 (1990)). This
experiment investigates whether some of the headpiece dimer
mutants might employ DNA binding sites other than lacOs.

To compare the mechanism of DNA binding between the
mutant headpiece dimer A4.5 and lacI, two pairs of plasmids
were constructed, one pair with, and the other pair without,
lacOs binding sites. Removal of lacOs sites from the plasmids
was carried out by replacing the NheI to AlwNI fragment with a
similar fragment that lacked lacOs. One member of each pair
displayed the D32.39 epitope linked to chloramphenicol
resistance, and the other carried ampicillin resistance but
lacked the epitope. Starting with cells mixed in the ratio of
1 Cam-resistant cell to 1000 Amp-resistant cells, lysates were
panned against the D32.39 antibody and the control MAb344.
Plasmids recovered from the antibody coated wells were
transformed into E. coli and plated on Amp (100 µg/ml) and Cam
(20 µg/ml) plates for the determination of Amp/Cam plasmid
ratios as a measure of enrichment. Enrichment was defined as
the starting Amp/Cam plasmid ratio divided by the final
(panning derived) Amp/Cam ratio.
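
The enrichment definition above can be written out as a short
calculation. The following Python sketch is illustrative only:
the function names and the colony counts in the usage lines are
hypothetical and are not data from the experiments described
here.

```python
def amp_cam_ratio(amp_colonies, cam_colonies):
    """Ratio of Amp-resistant to Cam-resistant colonies on the plates."""
    if cam_colonies <= 0:
        raise ValueError("at least one Cam colony is needed to form a ratio")
    return amp_colonies / cam_colonies


def enrichment(starting_ratio, final_ratio):
    """Enrichment = starting Amp/Cam ratio divided by final Amp/Cam ratio."""
    if final_ratio <= 0:
        raise ValueError("final ratio must be positive")
    return starting_ratio / final_ratio


# Hypothetical example: cells mixed at 1000 Amp per 1 Cam, and
# (invented) plate counts of 230 Amp and 10 Cam colonies after panning.
start = amp_cam_ratio(1000, 1)
final = amp_cam_ratio(230, 10)
fold = enrichment(start, final)
print(f"enrichment: {fold:.1f} fold")
```

A value near 1 means the epitope-bearing plasmid was not
enriched over the starting mixture; larger values mean that
panning preferentially recovered it.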

As expected, in three separate experiments, deletion
of lacOs sites resulted in an average 446 fold enrichment drop
to near background level for the lacI peptides-on-plasmids
construct. For headpiece dimer mutant A4.5 constructs,
however, deletion of the plasmid lacOs site had no significant
effect on enrichment. An average 43 fold enrichment was
obtained by the headpiece dimer mutant A4.5 constructs with
lacOs and an average 44 fold was achieved without lacOs. This
finding suggests that the headpiece dimer mutant A4.5 does not
require binding to lacOs as a mechanism of linkage to its
parent plasmid. This is consistent with observations made on
mutants of full length lacI which, upon substitution at
position 18, lose lacO site binding specificity (Kleina &
Miller, J. Mol. Biol. 212, 295-318 (1990); Lehming et al.,
EMBO J. 9, 615-621 (1990); Lehming et al., EMBO J. 6,
3145-3153 (1987)).

To determine the plasmid binding site(s) of
headpiece dimer mutant A4.5 and the non-mutant B7, several
protein-DNA binding experiments were performed. Preliminary
gel shift experiments with plasmid pCMG14 digested into small
fragments and combined with headpiece dimer A4.5 polypeptide
resulted in no visible shift for any of the plasmid fragments.
Other experiments using 32P-labeled plasmid fragments
complexed with over-expressed headpiece dimer A4.5, B7, and
full-length lacI polypeptides showed specific binding of lacI
to lacOs-containing fragments, but failed to show any specific
binding by the mutant (A4.5) and non-mutant (B7) headpiece
dimers. Another experiment, in the absence of unlabelled
herring DNA competitor, showed nonspecific binding to all of
the plasmid fragments by lacI and both headpiece dimer
isolates, indicating that some degree of nonspecific DNA
binding occurs for headpiece dimers and lacI alike.

These results indicate that the mutant headpiece
dimer, while offering improved performance over the lacI
plasmid, surprisingly does not require a lacO binding site for
use in the panning procedure. The in vitro binding data
suggest that the mutant headpiece dimer may not show a strong
preference for a specific sequence in the vector encoding the
headpiece dimer.

Although the mechanism and degree of sequence
specificity, if any, by which the mutant headpiece binds to
DNA are unclear, the power of the above methodology to
self-select optimal DNA binding proteins for use in screening
peptides is evident. The method has self-selected a
derivative DNA binding protein that has substantially
different binding characteristics than the natural lacI
protein, but which offers improved enrichment compared with
the lacI parent protein in selection of peptides having high
affinity for a receptor.

Example 6 
Standard Protocol 
This Example provides a standard protocol for the 
method of the present invention with any receptor that can be 
immobilized on a microtiter dish with an immobilizing 
antibody. 

(1) Reagents 

To practice the method, the following reagents will 
be helpful. 



Items                                      Vendor              Catalog #

BSA, fraction V, RIA grade                 USB                 10363
BSA, protease free                         USB                 10S67
Bulk DNA, sonicated, phenol extracted
Centriprep 100 concentrator, 5-15 ml       Amicon              4308
Chromatography column, G22X250             Amicon              95220
Coomassie Plus protein assay reagent       Pierce              23236
DTT
EDTA, disodium, dihydrate                  Sigma               E-5134
Ethyl alcohol, 200 proof                   Gold Shield Chem.
Glycerol                                   Sigma               G-5516
Glycogen, molecular biology grade          Boehringer          901 393
HEPES free acid                            Research Organics   6003H
Isopropanol, HPLC grade                    Aldrich             27,049-0
IPTG                                       Bachem              SISO10
a-Lactose, monohydrate                     Sigma               L-3625
Lysozyme, from hen egg white               Boehringer          837 059
Microtiter plate, Immulon 4, flat bottom   Dynatech            011-010-3850
PBS                                        Sigma               1000-3
PMSF
Phenol, equilibrated                       USB                 20072
Phenol:chloroform:isoamyl alcohol          USB                 20081
Potassium hydroxide solution, 8.0 N        Sigma               17-8
Potassium chloride                         Sigma               P-9541
Sodium chloride                            Sigma               S-3014
Sephacryl S-400, high resolution           Pharmacia           17-0609-01
Tubes w/ screw cap, 13 ml                  Sarstedt            60.540



The various buffers and other preparations referred
to in the protocol are shown below.

HE buffer is prepared at pH = 7.5 by adding 8.34 g of
HEPES, free acid (use a better grade than Sigma's standard;
the final concentration is 35 mM), to 200 µl of 0.5 M EDTA,
pH 8.0 (final concentration is 0.1 mM) and adding water to a
final volume of 1 L. The pH is adjusted with KOH.

HEK buffer is identical to HE buffer but also
contains KCl at a final concentration of 50 mM.

HEKL buffer is identical to HEK buffer but also
contains alpha-lactose, which may require warming to go into
solution, at a final concentration of 0.2 M.
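
The quantities in the HE and HEK recipes follow from two simple
relations: mass = molarity x molecular weight x volume for a
solid, and stock volume = final concentration x final volume /
stock concentration for a liquid stock. The Python sketch below
is a check of that arithmetic only; the function names are
hypothetical, the molecular weights (HEPES free acid, 238.30
g/mol; KCl, 74.55 g/mol) are standard values rather than values
stated in this protocol, and the KCl mass is a computed value,
not one given above.

```python
HEPES_MW = 238.30  # g/mol, HEPES free acid (standard value, not from the text)
KCL_MW = 74.55     # g/mol, potassium chloride (standard value, not from the text)


def grams_needed(molarity, mol_weight, volume_l):
    """Mass of solid required for a given molar concentration and volume."""
    return molarity * mol_weight * volume_l


def stock_volume_ml(final_molarity, final_volume_ml, stock_molarity):
    """Volume of a concentrated stock required (C1*V1 = C2*V2)."""
    return final_molarity * final_volume_ml / stock_molarity


# HE buffer, 1 L: 35 mM HEPES and 0.1 mM EDTA from a 0.5 M stock.
hepes_g = grams_needed(0.035, HEPES_MW, 1.0)
edta_ul = stock_volume_ml(0.0001, 1000.0, 0.5) * 1000.0

# HEK buffer additionally contains 50 mM KCl.
kcl_g = grams_needed(0.050, KCL_MW, 1.0)

print(f"HEPES: {hepes_g:.2f} g, EDTA stock: {edta_ul:.0f} ul, KCl: {kcl_g:.2f} g")
```

The first two results reproduce the 8.34 g of HEPES and 200 µl
of 0.5 M EDTA given for HE buffer.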

Lysis buffer (6 ml) is prepared by mixing 4.2 ml of
HE buffer with 1 ml of 50% glycerol, 750 µl of protease free
BSA at 10 mg/ml in PBS, 10 µl of 0.5 M DTT, and 12.5 µl of
0.1 M PMSF in isopropanol.

HEK/BSA buffer (1% BSA) is prepared by dissolving 5 g of
BSA, fraction V, in 500 ml of HEK buffer.

WTEK buffer is prepared at pH = 7.5 by adding 7.53 g
of Tris, pH = 7.5 (final concentration of 50 mM), to 20 ml of
0.5 M EDTA (final concentration of 10 mM) and 7.45 g of KCl
(final concentration of 100 mM) and adding water to a final
volume of 1 L.

TEK buffer is prepared at pH = 7.5 by adding 1.51 g
of Tris, pH = 7.5 (final concentration of 10 mM), to 200 µl of
0.5 M EDTA (final concentration of 0.1 mM) and 7.45 g of KCl
(final concentration of 100 mM) and adding water to a final
volume of 1 L.

(2) Bacterial Strains

E. coli ARI 439 (lon-11 sulA1 hsdR17 Δ(ompT-fepC)
ΔclpA319::kan ΔlacI lacZU118 Δ(srl-recA)306::Tn10) was used
for random dodecamer panning. E. coli ARI 814 (Δ(srl-recA)
endA1 nupG lon-11 sulA1 hsdR17 Δ(ompT-fepC) ΔclpA319::kan
ΔlacI lacZU118) was used for panning lacOs deletion variants
and to isolate DNA for sequencing.

The various mutations in strain ARI 814 are designed to
enhance various aspects of panning as described below. It was
constructed in 11 steps starting with an E. coli B strain from
the E. coli Genetic Stock Center at Yale University (E. coli
B/r, stock center designation CGSC6573) with genotype lon-11
sulA1. This strain was chosen as a starting point because of
its robust growth properties and because it yields excellent
electrocompetent cells, which are essential for construction
of large libraries and for the maintenance of clone diversity
during panning. In spite of considerable genetic
manipulation, the strain maintained these favorable growth and
transformation properties through the construction process.

The strain contains the hsdR17 allele from strain
MC1061, which prevents restriction of unmodified DNA introduced
by transformation or transduction. This mutation helps
maintain library diversity and simplified further construction
steps. The ompT-fepC deletion from strain UT5600 removes the
gene encoding the OmpT protease, which digests peptides
between paired basic residues. This protease is extremely
active in cell lysates and would potentially have been a major
limitation on the diversity of peptides in a random library.
The lon-11 and clpA mutations also limit proteolysis because
they prevent expression of ATP-dependent, cytoplasmic
proteases. The sulA1 allele suppresses a deleterious
filamentation phenotype often caused by lon mutations.

ARI 814 also contains a deletion of the lacI gene to
prevent expression of wild-type lac repressor, which would
compete with the fusion constructs for binding to the lacO
sites on the plasmid. The lacZ mutation prevents waste of the
cell's metabolic resources making β-galactosidase due to
absence of the repressor. The endA1 mutation knocks out
expression of a nuclease that has two deleterious effects on
panning. First, it could digest plasmids in the crude cell
lysate used for panning, reducing the number of recoverable
complexes. Second, it lowers the quality of DNA preparations
used for cloning or sequencing. Finally, the ARI 814 strain
contains a recA deletion to prevent multimerization of
plasmids through recA-catalyzed homologous recombination.

ARI 814 is prepared for use in electroporation
essentially as described by Dower, supra, except that 10%
glycerol is used for all wash steps. The cells are tested for
efficiency using 1 pg of a pBluescript plasmid (Stratagene).
Cells routinely yield transformation frequencies of 2 x 10^10
colonies per µg of DNA. These cells are used for growth of
the original library and for amplification of the enriched
population after each round of panning.
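
Transformation frequency is conventionally expressed as colonies
per µg of DNA, obtained by scaling the plate count back to the
whole transformation and dividing by the DNA mass used. The
Python sketch below illustrates that arithmetic; the function
name, the colony count and the plated fraction are hypothetical
values chosen only so that the result matches the frequency
quoted above for a 1 pg test transformation.

```python
def transformants_per_ug(colonies, plated_fraction, dna_ug):
    """Colonies per microgram of DNA.

    colonies        -- colonies counted on the plate
    plated_fraction -- fraction of the whole transformation represented
                       on that plate (dilution and plated volume combined)
    dna_ug          -- mass of plasmid DNA used, in micrograms
    """
    if not 0 < plated_fraction <= 1:
        raise ValueError("plated_fraction must be in (0, 1]")
    if dna_ug <= 0:
        raise ValueError("dna_ug must be positive")
    return colonies / plated_fraction / dna_ug


# Hypothetical test: 1 pg (1e-6 ug) of plasmid, with 1/100 of the
# transformation plated and 200 colonies counted.
efficiency = transformants_per_ug(colonies=200, plated_fraction=0.01, dna_ug=1e-6)
print(f"{efficiency:.1e} colonies per ug")
```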

(3) Library construction

The interrupted palindrome SfiI sites in pJS142 allow
efficient cloning of library oligos because they greatly
minimize undesired ligation events. Only the correct
orientation of the annealed library oligos can ligate
efficiently into the vector. In addition, once the SfiI
digested vector is purified away from the small internal
"stuffer" fragment, the vector ends cannot ligate to each
other because of incompatible sticky ends. Libraries
routinely have greater than 10^8 independent clones per µg of
vector used in the ligation.

Vector fragment for library construction can be
purified from the stuffer fragment by either of two methods.
For small scale (5-10 µg) library construction, pJS142 is
digested with SfiI and then with EagI (to reduce background)
and electrophoresed on an agarose gel. The vector fragment
can be eluted from the gel using the Geneclean kit (Bio 101).
For larger scale preparations, potassium acetate gradients are
used to purify vector fragment.

a. Procedure for Purification of Vector for Library

1. Digest 200 µg of pJS142 DNA to completion in
1 ml final volume with SfiI followed by EagI.

2. In a 1/2" x 2" ultraclear centrifuge tube,
carefully layer 5%, 10%, 15%, and 20% potassium acetate
solutions containing 1 mM EDTA and 2 ng/ml ethidium bromide,
using 1 ml of each.

3. Layer 1 ml of the digest on top of the gradient.
Centrifuge at 48,000 rpm for 3 hrs in a Beckman SW50.1 rotor.
The large vector fragment will migrate to a position ~2/3 of
the distance from the top of the gradient as visualized with a
long wave UV source. The small stuffer fragment remains at
the top of the gradient while undigested supercoiled DNA forms
a pellet on the bottom of the tube.

4. Puncture the tube with an 18 g syringe needle
attached to a 3 ml syringe and extract the fragment (~0.5 to
1.0 ml).

5. Remove the ethidium by extracting five times
with an equal volume of water saturated 1-butanol.

6. Transfer to a microfuge tube, add 1/10 volume
5 M NaCl, and then an equal volume of isopropanol. Centrifuge
at top speed for 10 min, pour off the liquid, and wash once
with 80% ethanol.

7. Resuspend the pellet in water or TE and
determine the concentration by reading A260. The yield from
the gradient is usually about 40% of the input amount.
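
The concentration and yield in step 7 can be estimated as in the
following Python sketch. It assumes the common approximation
that an absorbance of 1.0 at 260 nm in a 1 cm path corresponds
to about 50 µg/ml of double-stranded DNA; that factor, the
function names, and the absorbance reading, dilution and volume
in the example are assumptions made for illustration and do not
come from this protocol.

```python
DSDNA_UG_PER_ML_PER_A260 = 50.0  # assumed standard conversion factor


def dna_conc_ug_per_ml(a260, dilution_factor=1.0):
    """Estimate double-stranded DNA concentration from an A260 reading."""
    return a260 * DSDNA_UG_PER_ML_PER_A260 * dilution_factor


def percent_yield(recovered_ug, input_ug):
    """Recovered DNA as a percentage of the DNA loaded on the gradient."""
    return 100.0 * recovered_ug / input_ug


# Hypothetical reading: A260 of 0.16 on a 1:10 dilution, 1 ml recovered,
# starting from the 200 ug digest of step 1.
conc = dna_conc_ug_per_ml(0.16, dilution_factor=10.0)
recovered = conc * 1.0  # ug, for 1 ml of resuspended DNA
print(f"{conc:.0f} ug/ml, yield {percent_yield(recovered, 200.0):.0f}%")
```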

b. Procedure for Library Construction
Three oligos are needed for library construction:
ON-829 (5' ACC ACC TCC GG), ON-830 (5' TTA CTT AGT TA), and a
library specific oligo of sequence (5' GA GGT GGT {NNK}n TAA
CTA AGT AAA GC [SEQ ID NOS:85 and 86]), where {NNK}n denotes a
random region of the desired length and sequence. The oligos
can be 5'-phosphorylated chemically during synthesis or after
purification with polynucleotide kinase. They are then
annealed at a 1:1:1 molar ratio and ligated to the vector.
Note that the melting temperature of the annealed oligo
complex is quite low, so the final annealed mixture should
never be warmed above the 14°C ligation temperature.

1. Mix phosphorylated ON-829, ON-830, and the
library oligo (50 pmol each), 1 µl 5 M NaCl, 2.5 µl 1 M Tris,
pH 7.4, and dH2O to bring the total volume to 50 µl.

2. Heat to 70°C for 5 min in a temp block and then
turn off the block and allow the mixture to cool slowly to
around 30°C. Move the whole temp block into a 4°C room or
refrigerator and allow it to cool to below 10°C, then move the
samples onto ice.

3. Mix on ice: 5 µg (1.3 picomole) pJS142 fragment,
2.6 µl (2.6 picomole) annealing mix, 25 µl 10x ligase buffer,
dH2O to 250 µl; mix, then add 2 µl (800 NEB cohesive end
units) T4 ligase. In parallel, set up a 1/10 scale no oligo
control to check for background. Incubate at 14°C for
12-24 hours.

4. Heat to 65°C, 10 min to inactivate the ligase.
Add 2 µl 25 mM dNTP mixture (Pharmacia), 1 µl (13 units)
Sequenase 2.0 (US Biochemicals). Add 1/10 amounts to the
control ligation to determine ligation efficiency compared to
the control.

5. Add 250 µl H2O, 55 µl 5 M NaCl to the library.
Extract with 300 µl phenol/CHCl3, spin 3 min, and move 500 µl
of the aqueous phase to a new microfuge tube.

6. Add 1 µl 20 mg/ml glycogen (Boehringer Mannheim
molecular biology grade) and 500 µl isopropanol. Mix well and
spin in microfuge at top speed for 10 min.

7. Pour off the liquid, close the tube, and spin
briefly. Use a fine bore pipet tip to remove the last traces
of liquid without disturbing the pellet. Wash the pellet with
500 µl of 4°C 80% ethanol, spin 2 min. Pour off the liquid,
close the tube, and spin briefly. Use a fine bore pipet tip
to remove the last traces of liquid. This careful washing
procedure is important to remove all traces of salt to prevent
problems during the electroporation step.

8. Resuspend the pellet in 4 µl dH2O. Store at
-20°C until ready for amplification.
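
The relationship between the three oligos named at the start of
this procedure can be checked computationally. The Python
sketch below is illustrative only: it assumes that N denotes any
of A, C, G or T and K denotes G or T (the usual meaning of NNK),
and the function and constant names are hypothetical. It
confirms that ON-829 and ON-830 are complementary to the fixed
arms of the library oligo, and lists the codons that an NNK
random region can contain.

```python
import itertools

ON_829 = "ACCACCTCCGG"
ON_830 = "TTACTTAGTTA"
LEFT_ARM = "GAGGTGGT"          # fixed 5' arm of the library oligo
RIGHT_ARM = "TAACTAAGTAAAGC"   # fixed 3' arm of the library oligo

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def nnk_codons():
    """All codons allowed by NNK: N = A/C/G/T, K = G/T (assumed meaning)."""
    return ["".join(c) for c in itertools.product("ACGT", "ACGT", "GT")]


def library_oligo(random_region):
    """Assemble a library oligo from a concrete NNK random region."""
    if len(random_region) % 3 != 0:
        raise ValueError("random region must be a whole number of codons")
    allowed = set(nnk_codons())
    for i in range(0, len(random_region), 3):
        if random_region[i:i + 3] not in allowed:
            raise ValueError(f"codon {random_region[i:i + 3]} is not NNK")
    return LEFT_ARM + random_region + RIGHT_ARM


codons = nnk_codons()
stop_codons = [c for c in codons if c in ("TAA", "TAG", "TGA")]
print(len(codons), "NNK codons; stop codons among them:", stop_codons)

# ON-829 pairs with the 5' arm and ON-830 with the 3' arm, leaving short
# single-stranded ends on the annealed complex.
print(reverse_complement(ON_829), reverse_complement(ON_830))
```

Restricting the third codon position to G or T leaves a single
stop codon (TAG) in the random region while still encoding all
twenty amino acids.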
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(4) Screening 

The library can be screened over a two-day period as
follows.

Day 1

1. Coat two sets of 12 microtiter wells with the
appropriate amount of immobilizing antibody in 100 µl of PBS,
for panning and negative control; let the coated plate
incubate at 37°C for 1 hr. Consider using all 24 wells as
"plus receptor" wells in the first round, i.e., no negative
control in the first round.

2. Wash the plate four times (4x) with HEK/BSA.

3. Block wells by adding 200 µl of HEK/BSA to each
well; let the plate incubate at 37°C for 1 hr.

4. Wash the plate 4x with HEK/BSA.

5. Dilute the receptor preparation in cold HEK/BSA
(or appropriate binding buffer) as necessary.

6. Add the diluted receptor preparation to the
wells at 100 µl per well; let the plate incubate at 4°C for
1 hr with agitation.

7. Wash the plate 2x with cold HEK/BSA.

8. Add 100 µl of 0.1 mg/ml bulk DNA in HEK/BSA to
each well; incubate the plate at 4°C for at least 10 minutes.

On day 1, steps A-O can also be carried out. Note
that the column separation (steps A and H-O) is optional. If
the column separation is omitted, the lysates from step G are
added directly to the wells.

A. Begin equilibrating column (22 mm diameter x
22 cm height of Sephacryl S-400) with cold HEKL (~1 hr; flow
rate is set to collect 5 ml fractions every 2 to 3 minutes).

B. Prepare 1 ml of lysozyme at 10 mg/ml in cold HE.

C. Thaw and combine sub-libraries (2 ml total
volume) in a 13 ml Sarstedt screw cap tube.

D. Add 6 ml of lysis buffer and 150 µl lysozyme
solution (Boehringer lysozyme is preferred over Sigma
lysozyme); mix by inverting gently; and incubate on ice for
5 minutes, although less time is often satisfactory.

E. Add 2 ml of 20% lactose (lacI libraries only)
and 250 µl of 2 M KCl (200 µl for headpiece dimer libraries),
and mix by inverting gently.

F. Spin at 14.5 K for 15 minutes.

G. Transfer supernatant by pipetting into a new
tube.

H. Load raw lysate onto the equilibrated column.

I. After lysate is loaded, collect ten 5 ml
fractions.

J. Perform the Coomassie protein assay as follows:
(1) to 10 microtiter wells, add 100 µl of Coomassie reagent
and 20 µl from each fraction, and mix; (2) select 4
consecutive fractions which correspond to 1 brown and 3 blue
wells from the assay (light blue counts as blue).

K. Combine selected fractions in a Centriprep 100.
Two Centripreps may be used to speed up the process. The
maximum capacity of each Centriprep is about 15 ml.

L. Spin in J-6B centrifuge at 1500 rpm.

M. Rinse the column with cold HEK for 1 hr.

N. Empty liquid from the inner chamber every
15 minutes until final volume < 2 ml (~1 hr).

O. Determine lysate volume, and remove 1% as "Pre"
sample; keep Pre sample on ice.

Returning to the numbered steps, one proceeds as
follows.

9. Wash plate 2x with cold HEK/BSA.

10. Bring the volume of the concentrated lysate up
to 2400 µl by adding HEKL/BSA; add bulk DNA to a final
concentration of 0.1 mg/ml. The activity of the receptor in
this buffer should be verified.

11. Add lysate at 100 µl per well; incubate the
plate at 4°C for 1 hr with agitation.

12. Wash plate 4x with cold HEKL/BSA.

13. Add 100 µl of 0.1 mg/ml bulk DNA in HEKL/BSA to
each well; incubate at 4°C for 30 minutes with agitation.

14. Wash plate 4x with cold HEKL.

15. Quickly wash plate 1x with cold HEK.

16. Elute by adding to each well 50 µl 10 mM Tris
pH 8, 1 mM EDTA, 0.5 M NaCl, then add 50 µl equilibrated
phenol, and agitate for 5 min.

17. Remove all eluants; centrifuge to separate
phases; remove aqueous phase to a new tube.

18. Add one-tenth volume of 5 M NaCl and 1 µl of
20 mg/ml glycogen as carrier.

19. Precipitate plasmids in equal volume of
isopropanol at room temperature.

20. Spin 10 minutes; carefully remove supernatant,
spin again, and remove remaining supernatant.

21. Wash with 200 µl of cold 70% EtOH.

22. Spin and remove traces of supernatant as above.

23. Resuspend plasmids in water (suggested volumes:
100 µl for Pre; and 4 µl each for the panning and negative
control wells; use more than 4 µl for panning and negative
control samples in later rounds to retain as backups).

Day 2

24. Chill 4 sterile 0.2 cm electrode gap cuvettes on
ice. The panning sample is divided equally into 2 cuvettes to
prevent complete loss of sample during electroporation.

25. To three 16 ml sterile culture tubes, add 1 ml
SOC medium (2% Bacto-Tryptone, 0.5% Bacto-yeast extract, 10 mM
NaCl, 2.5 mM KCl, 10 mM MgCl2, 10 mM MgSO4, and 20 mM glucose)
to two tubes and 2 ml to one tube. Label the two 1 ml tubes
as "Pre" and "NC" (for "negative control"), and label the 2 ml
tube as "Pan" (for "panning").

26. Thaw 200 µl of high efficiency electro-competent
cells.

27. Transfer 40 µl aliquots of cells to 4 chilled
sterile eppendorf tubes; incubate the tubes on ice.

28. Add 2 µl of each plasmid to each tube and mix
gently.

29. Transfer cells/plasmids mixtures into their
corresponding cuvettes; keep the cuvettes on ice.

30. Set the Gene Pulser apparatus to 2.5 kV, 25 µF
capacitance, and set the Pulser Controller unit to 200 ohms.

31. Apply one pulse (time constant = 4-5 msec).

32. Immediately add the room temperature SOC medium
to resuspend cells in the cuvette.

33. Transfer cell suspension back to the culture
tube.

34. Incubate the culture tube at 37°C for 1 hr with
agitation.

35. To 200 ml of LB broth prewarmed to 37°C, add
0.4 ml of 50 mg/ml ampicillin.

36. Remove 10 to 100 µl of the "Pan" library culture
for plating, and transfer the rest (2 ml) to the prewarmed LB
broth. Plate out several dilutions of each sample on LB
plates containing ampicillin. Suggested plate dilutions are
as follows: Pre: 10^-5, 10^-6 and 10^-7; Pan/NC: 10^-3,
10^-4, 10^-5 and 10^-6.

37. Grow "Pan" library at 37°C for about 4-5 hr
until the OD600 = 0.5-1.0.

38. Chill the flask rapidly in ice water for at
least 10 minutes.

39. Centrifuge cells in 250 ml sterile bottle at 6K
for 6 minutes, Beckman JA-14 rotor.

40. Wash by vortexing cells in 100 ml of cold WTEK.

41. Centrifuge at 6K for 6 minutes.

42. Wash by vortexing cells in 50 ml cold TEK.

43. Centrifuge at 6K for 6 minutes.

44. Resuspend cells in 4 ml of HEK and store in two
2 ml vials at -70°C. Use one tube for the next round; keep
the other as a backup.
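
The colony counts from the dilution plates of step 36 can be
converted into an estimate of the fraction of input plasmids
recovered by panning. The Python sketch below shows one way to
do the bookkeeping. It is a sketch under stated assumptions:
the function names are hypothetical, the plated volume of 0.1 ml
and all colony counts are invented for illustration, and the
sampling fractions follow the volumes given in steps O, 23, 25
and 28 (1% of the lysate kept as "Pre", resuspended in 100 µl,
with 2 µl electroporated into 1 ml; the whole "Pan" sample
electroporated into 2 ml).

```python
import math


def cfu_per_ml(colonies, dilution, plated_ml):
    """Colony forming units per ml of undiluted culture."""
    return colonies / (dilution * plated_ml)


def total_transformants(colonies, dilution, plated_ml, culture_ml,
                        fraction_transformed=1.0, fraction_sampled=1.0):
    """Scale a plate count back to the whole sample it represents."""
    in_culture = cfu_per_ml(colonies, dilution, plated_ml) * culture_ml
    return in_culture / fraction_transformed / fraction_sampled


# "Pre": hypothetical 150 colonies on the 10^-5 plate.
input_total = total_transformants(
    colonies=150, dilution=1e-5, plated_ml=0.1, culture_ml=1.0,
    fraction_transformed=2.0 / 100.0,  # 2 ul of the 100 ul resuspension
    fraction_sampled=0.01,             # "Pre" is 1% of the lysate
)

# "Pan": hypothetical 80 colonies on the 10^-3 plate, 2 ml culture.
output_total = total_transformants(
    colonies=80, dilution=1e-3, plated_ml=0.1, culture_ml=2.0,
)

recovery = output_total / input_total
print(f"input {input_total:.2e}, output {output_total:.2e}, "
      f"fraction recovered {recovery:.2e}")
```

Comparing the fraction recovered from the "Pan" wells with that
from the "NC" wells in successive rounds gives the enrichment
numbers used to decide when to examine individual clones.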

(5) Examination of Individual Clones by ELISA
The binding properties of the peptides encoded by
individual clones are typically examined after 3, 4, or 5
rounds of panning, depending on the enrichment numbers
observed. The most sensitive assay is an ELISA that detects
receptor specific binding by lacI-peptide fusion proteins.
The lacI ELISA can detect binding of peptides that have
monovalent affinities for the receptor as low as ~100 µM.
This sensitivity of the assay is an advantage in that initial
hits of low affinity can be easily identified, but is a
disadvantage in that the signal in the ELISA is not correlated
with the intrinsic affinity of the peptides. Fusion of the
peptides to the maltose binding protein (MBP) as described
below permits testing in an ELISA where signal strength is
better correlated with affinity.

a. Reagents for Lysates

• Lysis Buffer (make fresh just before use)
    42 ml HE
    5 ml 50% glycerol
    3 ml 10 mg/ml BSA, protease free, in HE
    125 µl 0.1 M PMSF (may include other protease inhibitors)
    750 µl 10 mg/ml lysozyme in HE

• 20% L-arabinose in dH2O, sterile (Important: do not
use D-arabinose)

b. Procedure for the Preparation of lacI ELISA Lysates

1. Inoculate each individual clone in 1 ml LB-Amp,
shake at 37°C, overnight.

2. Dilute 300 µl of the culture into 3 ml LB-Amp,
shake at 37°C for 1 hr.

3. Induce with 33 µl of 20% L-arabinose (0.2%
final), shake at 37°C for 2-3 hrs.

4. Spin at 4,000 rpm, 5 min, Beckman JS 4.2 rotor.

5. Decant supernatant, keep cells on ice or at 4°C
for the rest of the procedure.

6. Vortex to resuspend cells in 3 ml 4°C WTEK
buffer.

7. Spin 4,000 rpm, 5 min, pour off supernatant.

8. Vortex to resuspend cells in 1 ml 4°C TEK
buffer; transfer to 1.5 ml microfuge tubes.

9. Spin 14,000 rpm, 2 min, aspirate supernatant.

10. Resuspend cells in 1 ml lysis buffer, incubate
on ice, 1 hr.

11. Add 110 µl 2 M KCl (final concentration of
0.2 M) to solubilize fusion proteins, invert to mix. Note
that most of the lacI protein will be present as insoluble
inclusion bodies that will be part of the pellet discarded in
step 13. Enough lacI protein is soluble to allow a strong
signal in the ELISA. The KCl helps increase the amount of
soluble lacI.

12. Spin 14,000 rpm, 15 min, 4°C in a microfuge.

13. Transfer ~900 µl of the clear crude lysate to a
new tube. (Store at -70°C if assay is to be done on another
day.)
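
The final concentrations quoted in steps 3 and 11 follow from
the usual dilution relation, final = stock x added volume /
(starting volume + added volume). The Python sketch below
simply checks that arithmetic; the function name is
hypothetical, and the 3.3 ml starting volume in step 3 assumes
that the 300 µl of overnight culture was added to the full 3 ml
of medium.

```python
def final_concentration(stock_conc, added_ul, starting_ul):
    """Concentration after adding a stock to an existing volume.

    The result is in the same units as stock_conc.
    """
    return stock_conc * added_ul / (starting_ul + added_ul)


# Step 3: 33 ul of 20% L-arabinose into about 3.3 ml of culture.
arabinose_pct = final_concentration(20.0, added_ul=33.0, starting_ul=3300.0)

# Step 11: 110 ul of 2 M KCl into 1 ml of lysate.
kcl_molar = final_concentration(2.0, added_ul=110.0, starting_ul=1000.0)

print(f"L-arabinose: {arabinose_pct:.2f}%  KCl: {kcl_molar:.2f} M")
```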

c. Reagents for ELISA

• PBT: PBS, 1% BSA, 0.05% Tween-20

• PBS/Tween: PBS, 0.05% Tween-20

• Anti-lacI antibody: Rabbit anti-lacI polyclonal can
be purchased from Stratagene (#217449).

• Goat anti-Rabbit IgG and light chains, alkaline
phosphatase conjugate is from Tago (#6500).

• Alkaline phosphatase substrate is p-nitrophenyl
phosphate.

• Development buffer: 9.6% diethanolamine, 0.24 mM
MgCl2, pH 9.8 with HCl.

d. Procedure for lacI ELISA

1. Coat microtiter wells with the receptor of
interest. Make equivalent set of minus receptor control wells
in parallel. Block wells for at least 1 hr with 1% BSA. The
control wells should be as similar as possible to the receptor
coated wells to control for various sorts of nonspecific
binding by the peptides. The assay is usually performed in
duplicate or triplicate wells.

2. Wash plate 4x with 4°C PBS/Tween.

3. Add 100 µl/well crude lysate diluted 1/20 in
PBT; 4°C, 30 min, shake gently.

4. Wash plate 4x with 4°C PBS/Tween.

5. Add 100 µl/well anti-lacI antibody diluted
1/15,000 in PBT; 4°C, 30 min, shake gently. The dilution of
anti-lacI given here is based on our titration of our own
serum. It may be necessary to use a different dilution of the
commercially available serum.

6. Wash plate 4x with 4°C PBS/Tween.

7. Add 100 µl/well goat anti-rabbit alkaline
phosphatase conjugated Ab diluted 1/3,000 in PBT; 4°C, 30 min,
shake gently.

8. Wash plate 4x with 4°C PBS/Tween.

9. Wash plate 2x with 4°C TBS (10 mM Tris pH 7.5,
150 mM NaCl).

10. Develop assay with 200 µl/well of 1 mg/ml alkaline
phosphatase substrate in development buffer.

11. Read plate at A405 in microtiter plate reader.
(Take time point measurements to determine termination time.
Reaction is no longer linear above A405 ~1.0.)

12. Stop reaction with 50 µl/well 2 M NaOH and read
final result.

e. Transfer of selected sequences to maltose
binding protein (MBP)

Coding sequences of interesting single clones or
populations of clones are often transferred to vectors that
fuse those sequences in frame with the gene encoding MBP.
This is done for several reasons. First, MBP generally exists
in solution as a monomer and the native protein has no
cysteine residues. The monovalency of peptide display allowed
by MBP fusions causes the MBP ELISA described below to be much
more affinity sensitive than the lacI ELISA. Dimer forms
have been reported for MBP purified under certain conditions.
These dimers can be dissociated by the addition of maltose to
the solution. No substantial difference in the MBP ELISA
signal is seen in the presence and absence of 1 mM maltose
using the protocols listed here, so dimer formation under our
conditions appears unlikely.

The second reason for using MBP is that it can be
expressed in very large amounts as a soluble protein which is
easily purified, allowing initial examination of the
properties of peptides without the need for chemical
synthesis. Third, the MBP fusion proteins can be directed to
either the cytoplasm (a reducing environment) or the periplasm
(an oxidizing environment) of E. coli using vectors that
differ only by the presence or absence of an N-terminal signal
sequence in the gene encoding MBP. Some peptides are
expressed more efficiently in one or the other of these two
environments. Fourth, peptide populations linked to MBP can
be easily screened using colony lifts with a selected
receptor.

The cloning of a library into pJS142 creates a BspEI
restriction site near the beginning of the random coding
region of the library. Digestion with BspEI, NheI and ScaI
allows the purification of a ~900 bp DNA fragment that can be
subcloned into one of two vectors, pELM3 (cytoplasmic) or
pELM15 (periplasmic), which are simple modifications of the
pMALc2 and pMALp2 vectors, respectively, available
commercially from New England Biolabs. Digestion of pELM3 and
pELM15 with AgeI and ScaI allows efficient cloning of the
BspEI-ScaI fragment from the pJS142 library. The BspEI and
AgeI ends are compatible for ligation. In addition, correct
ligation of the ScaI sites is essential to recreate a
functional bla (Amp resistance) gene, thus lowering the level
of background clones from undesired ligation events.
Expression of the tac promoter-driven MBP-peptide fusions can
then be induced with IPTG.

f. Procedure for Subcloning into MBP Vectors

1. Digest pELM3 or pELM15 with AgeI and ScaI.
Purify the 5.6 kb fragment away from the 1.0 kb fragment. The
digest is generally run in a 0.7% agarose gel, and the
appropriate region of the ethidium bromide stained gel excised
under low-intensity long wave UV illumination, and run on a
new gel. Electrophoresis in the second gel yields an
additional purification of the desired fragment and leads to
lower background in the ligation. Elute the DNA from the gel
fragment using a Geneclean kit (Bio 101).

2. Remove a 5-50 ml portion from the 200 ml PAN
amplification culture before harvesting the cells. Allow the
removed portion to grow to saturation overnight. Prepare DNA
from the cells and digest with BspEI and ScaI. Purify the
0.9 kb BspEI-ScaI fragment from the 3.1 and 1.7 kb vector
fragments as described above.

3. Ligate an equimolar mix of the two fragments at
a final DNA concentration of ~50 µg/ml with T4 DNA ligase in
standard ligase buffer containing 0.4 mM ATP (the higher
levels of ATP found in most ligase buffers inhibit efficient
ligation of the ScaI blunt ends). Incubate at 14°C overnight.

4. Inactivate ligase at 65°C for 10 min. To lower
background from religation of the parental vector, digest the
ligation mix with XbaI. Isopropanol-precipitate the ligation
mix using 1 µl of glycogen as carrier, wash carefully with 80%
ethanol, and resuspend the dry pellet in 20 µl dH2O.
Transform ARI 814 with 1 µl, and plate on LB-Amp plates.
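
For the equimolar ligation of step 3, the mass of each fragment
is proportional to its length. The Python sketch below works
through one case; the 20 µl reaction volume, the function names,
and the approximation of 650 g/mol per base pair of
double-stranded DNA are assumptions made for illustration and
are not specified in this procedure.

```python
import math

G_PER_MOL_PER_BP = 650.0  # assumed average mass of one base pair


def equimolar_masses_ug(total_ug, vector_bp, insert_bp):
    """Split a total DNA mass so that vector and insert are equimolar."""
    vector_ug = total_ug * vector_bp / (vector_bp + insert_bp)
    insert_ug = total_ug * insert_bp / (vector_bp + insert_bp)
    return vector_ug, insert_ug


def picomoles(mass_ug, length_bp):
    """Convert a mass of double-stranded DNA to picomoles of molecules."""
    return mass_ug * 1e6 / (length_bp * G_PER_MOL_PER_BP)


# Hypothetical 20 ul ligation at 50 ug/ml total DNA: 1 ug in all.
total_ug = 50.0 * 20.0 / 1000.0
vector_ug, insert_ug = equimolar_masses_ug(total_ug, vector_bp=5600, insert_bp=900)

print(f"vector {vector_ug:.3f} ug ({picomoles(vector_ug, 5600):.3f} pmol), "
      f"insert {insert_ug:.3f} ug ({picomoles(insert_ug, 900):.3f} pmol)")
```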

g. Procedure for MBP ELISA

The cell lysates for the MBP ELISA are prepared by
the same procedure as the lacI ELISA lysates, except that the
induction is done with a final concentration of 0.3 mM IPTG
instead of L-arabinose. The ELISA is performed as described
for lacI above with the following exceptions:

1. Lysates are diluted 1/50 for addition to the
wells.

2. Primary antibody is 1/10,000 diluted polyclonal
rabbit anti-MBP (available from New England Biolabs).
Incubation is for 15 instead of 30 min.

3. The secondary antibody incubation is also for 15
instead of 30 min.

4. Development of the assay generally takes longer
than the lacI ELISA, generally 30-60 min.

Although the foregoing invention has been described 
in some detail by way of illustration and example for purposes 
of clarity of understanding, it will be apparent that certain 
changes and modifications may be practiced within the scope of 
the appended claims. 




WO 96/40987 PCT/US96/09809

The cell lines described in the application as having been
deposited at the ATCC will be maintained at an authorized
depository and replaced in the event of mutation, nonviability
or destruction for a period of at least five years after the
most recent request for release of a sample was received by
the depository, for a period of at least thirty years after
the date of the deposit, or during the enforceable life of the
related patent, whichever period is longest. All restrictions
on the availability to the public of the cell lines will be
irrevocably removed upon the issuance of a patent from the
above-captioned application.

SEQUENCE LISTING



(1) GENERAL INFORMATION: 



(i) APPLICANT: Schatz, Peter J.
Cull, Millard G.
Miller, Jeff F.
Stemmer, Willem P.C.
Gates, Christian M.



(ii) TITLE OF INVENTION: Peptide Library and Screening Method 
(iii) NUMBER OF SEQUENCES: 162 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: William M. Smith 

(B) STREET: One Market Plaza, Steuart Tower, Suite 2000

(C) CITY: San Francisco 

(D) STATE: California 

(E) COUNTRY: USA 

(F) ZIP: 94105 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 
(B) COMPUTER: IBM PC compatible

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.25

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: US 08/548,540

(B) FILING DATE: 25-OCT-1995 

(C) CLASSIFICATION: 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 08/290,641 

(B) FILING DATE: 15-AUG-1994 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 07/963,321 

(B) FILING DATE: 15-OCT-1992 

(viii) ATTORNEY / AGENT INFORMATION: 

(A) NAME: Smith, William M. 

(B) REGISTRATION NUMBER: 30,223 

(C) REFERENCE/DOCKET NUMBER: 16528J-001240US 

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: 415-326-2400 

(B) TELEFAX: 415-326-2422

(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

Gly Ala Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 



(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 54 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 
GTGGCGCCNN KNNKNNKNNK NNKNNKNNKN NKNNKNNKNN KNNKTAAGGT CTCG 54
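SEQ ID NO: 2 contains twelve NNK codons, where N is any base and K is G or T under the IUPAC nucleotide code. As an illustrative sketch only (the helper names are not from the application), the following enumerates what a single NNK codon can encode under the standard genetic code:

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G (third base varies fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# IUPAC degenerate symbols used in the library oligonucleotide.
IUPAC = {"N": "ACGT", "K": "GT"}


def expand(degenerate_codon):
    """List every concrete codon matched by a degenerate codon such as 'NNK'."""
    pools = [IUPAC.get(base, base) for base in degenerate_codon]
    return ["".join(c) for c in product(*pools)]


nnk = expand("NNK")
amino_acids = {CODON_TABLE[c] for c in nnk} - {"*"}
stops = sorted(c for c in nnk if CODON_TABLE[c] == "*")
print(len(nnk), len(amino_acids), stops)  # 32 20 ['TAG']
```

An NNK codon therefore covers all 20 amino acids with 32 codons and admits only the TAG stop codon, which is the usual reason for preferring it over fully random NNN codons in library construction.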



(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 11 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 
TGCCACCGCG G 



(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:
ATTCCAGAGC TCGA 14

(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: Peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

Leu Glu Ser Gly Gln Gly Ala Asp Gly Ala
1 5 10



(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 45 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 
CTCGAGAGCG GGCAGGGGGC CGACGGGGCC TAATTAATTA AGCTT 



(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 13 amino acids
(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(vii) IMMEDIATE SOURCE:

(B) CLONE: dynB 1.0

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

Tyr Gly Gly Phe Leu Arg Arg Gln Phe Lys Val Val Thr
1 5 10



(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: 21 4 1.2 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

Thr Gly Lys Arg Gly Phe Lys Val Val Cys Asn Ser
1 5 10

(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: 22 4 1.2

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

Lys Arg Asn Phe Lys Val Val Gly Ser Pro Cys Gly
1 5 10

(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: 10 4 0.3

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

Ser Asp Ser Gly Asn Gly Leu Gly Ile Arg Arg Phe Lys Val Ser Ser
1 5 10 15

Leu Ala Val Leu Ala Asp Glu Arg Arg Phe Ser Ala
20 25

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: 30 4 0.9 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

Gly Thr Arg Pro Phe Lys Val Ser Glu Tyr Ile Leu
1 5 10

(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: 35 4 0.2 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

Ser Leu Lys Asp Glu Asn Asn Lys Arg Arg Ile Phe Lys Val Ser Ser
1 5 10 15

Leu Ala Val Leu Ala Asp Glu Arg Arg Phe Ser Ala 
20 25 

(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 12 amino acids
(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: 57 3 0.9

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

Ser Tyr Leu Arg Arg Glu Phe Lys Val Ser Gly Val
1 5 10

(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: 24 4 0.9 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

Gly Trp Arg Ser Cys Pro Arg Gln Phe Lys Val Thr
1 5 10

(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: 45 3 0.9

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

Ile Lys Arg Gly Phe Lys Ile Thr Ser Ala Met Ser
1 5 10



(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: 47 3 0.3

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

Val Arg Phe Ile Ala Arg Pro Phe Arg Ile Thr Gly
1 5 10



(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: 71 2 1.1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

Ala Arg Ala Phe Arg Val Thr Arg Ile Ala Gly Val
1 5 10

(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(vii) IMMEDIATE SOURCE:

(B) CLONE: 74 2 0.2 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 

Lys Asn Glu Thr Arg Arg Pro Phe Arg Gln Thr Ala
1 5 10 



(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: 63 2 0.6

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

Val Asn His Arg Arg Phe Ser Val Val His Ser Tyr
1 5 10



(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: 48 3 0.4 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:20: 

Val Ser Ser Ser Arg Thr Phe Asn Val Thr Arg Arg 
1 5 10

(2) INFORMATION FOR SEQ ID NO: 21:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

( C) STRANDEDNESS : single 

( D ) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: 46 3 0.3

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:

Gly Arg Ser Phe His Val Thr Ser Phe Gly Ser Val
1 5 10



(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: 4 4 1.1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 

Arg Ser Thr Thr Val Arg Gln His Lys Val Val Gly
1 5 10



(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: 15 4 1.2 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 

Glu Arg Pro Asn Arg Leu His Lys Val Val His Ala 
1 5 10

(2) INFORMATION FOR SEQ ID NO: 24:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: 73 2 0.5 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 

Trp Gln Asn Arg Thr His Lys Val Val Ser Gly Arg
1 5 10



(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 7 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: 73 2 1.1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:

Ala Arg Lys His Lys Val Thr 
1 5 



(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: 40 3 1.1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:

Arg Gln Val Thr Arg Leu His Lys Val Ile His
1 5 10

(2) INFORMATION FOR SEQ ID NO: 27:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: 11 4 1.0 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:

Cys Pro Gly Glu Arg Met His Lys Ala Val Arg Ala
1 5 10



(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: 2 4 1.0 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:

Ser Arg Cys Arg Asn His Arg Val Val Thr Ser Gln
1 5 10



(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: 26 4 0.8 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 

Asn Asp Gly Arg Pro His Arg Val Val Arg Cys Gly 
1 5 10

(2) INFORMATION FOR SEQ ID NO: 30:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 12 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: 9 4 0.3 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:30: 

Glu Ile Arg Arg His Arg Val Thr Glu Arg Val Asp
1 5 10



(2) INFORMATION FOR SEQ ID NO: 31: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: 56 3 1.1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:31: 

Leu Arg Arg Leu His Arg Val Thr Asn Thr Met Thr
1 5 10

(2) INFORMATION FOR SEQ ID NO: 32:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 12 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(vii) IMMEDIATE SOURCE:

(B) CLONE: 69 2 1.1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:

Val Lys Gln Arg Leu His Ser Val Val Arg Pro Gly
1 5 10

(2) INFORMATION FOR SEQ ID NO: 33:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: 7 4 1.1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:

Val Thr Gln Arg Val Arg Ser Asn Lys Val Val Ser
1 5 10



(2) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: 20 4 1.1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:

His Val Glu Lys Ile Lys Arg Leu Asn Lys Val Val
1 5 10



(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH : 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

( D ) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: 23 4 1.2 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35:

Arg Leu Lys Thr Arg Leu Asn Lys Val Val Met Asp
1 5 10

(2) INFORMATION FOR SEQ ID NO: 36:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(vii) IMMEDIATE SOURCE: 

(B) CLONE: 63 2 0.4 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:36: 

Val Arg Met Asn Lys Val Val Cys Glu Lys Leu Trp
1 5 10



(2) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: 49 3 0.3 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37:

Asp Leu Lys Arg Leu Asn Arg Val Val Gly His 
1 5 10 



(2) INFORMATION FOR SEQ ID NO: 38:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: 19 4 0.3 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38: 

Arg Ile Arg Asn Asn Lys Val Ile Ala Arg Pro Val
1 5 10

(2) INFORMATION FOR SEQ ID NO: 39:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: 36 4 0.5 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:39: 

Ser Arg Val Arg Ser Asn Lys Val Ile Met Ser Ile
1 5 10



(2) INFORMATION FOR SEQ ID NO: 40: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: 77 2 0.6

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40:

Ser Cys Arg Leu Asn Lys Val Ile Ala Arg Pro Val
1 5 10



(2) INFORMATION FOR SEQ ID NO: 41: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

( B ) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: 33 4 0.5

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41:

Arg Ala Leu Ser Lys Asp Arg Leu Asn Lys Val Thr
1 5 10

(2) INFORMATION FOR SEQ ID NO: 42:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(vii) IMMEDIATE SOURCE:

(B) CLONE: 58 3 1.1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42:

Cys Thr Thr Glu Arg Ser Arg Gln Trp Lys Val Thr
1 5 10



(2) INFORMATION FOR SEQ ID NO: 43: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: 16 4 1.1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43: 

Ala Arg Pro Trp Lys Ile Thr Arg Asn Glu Pro Gly



(2) INFORMATION FOR SEQ ID NO: 44: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: 72 2 0.3

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44:

Gly Val Ser Glu Cys Arg Lys Trp Lys Ile Val Gln
1 5 10

(2) INFORMATION FOR SEQ ID NO: 45:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 12 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: 6 4 1.2 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45: 

Thr Thr Leu Arg Arg Tyr Lys Val Thr Gly Glu Arg 
1 5 10

(2) INFORMATION FOR SEQ ID NO: 46:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 12 amino acids
(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(vii) IMMEDIATE SOURCE:

(B) CLONE: 34 4 1.1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46:
Ile Ala Asp Arg Arg Pro Tyr Arg Val Thr Arg Pro
1 5 10

(2) INFORMATION FOR SEQ ID NO: 47: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 12 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide

(vii) IMMEDIATE SOURCE: 

(B) CLONE: 76 2 1.2 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47:

Ala Gly Lys Val Leu Arg Ala Tyr Lys Ile Val Glu
1 5 10

(2) INFORMATION FOR SEQ ID NO: 48:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 12 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: 8 4 1.0 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48:
Gln Lys Arg Leu Met Lys Val Ile Phe Glu ^
1 5 10

(2) INFORMATION FOR SEQ ID NO: 49:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 12 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(vii) IMMEDIATE SOURCE:

(B) CLONE: 55 3 1.0

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49:

«u Val Pro His Arg Phe Arg Trp Thr Lys His Met

(2) INFORMATION FOR SEQ ID NO: 50:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 amino acids
(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: 13 4 0.1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50:

Ser Thr Thr Glu Arg Arg Ser Phe Lys Val Ser Ser Leu Ala Val Leu
1 5 10 15

Ala Asp Glu Arg Arg Phe Ser Ala
20

(2) INFORMATION FOR SEQ ID NO: 51:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(vii) IMMEDIATE SOURCE:

(B) CLONE: 14 4 0.2

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51:

Arg Leu Pro Gly Arg Met Phe Lys Val Ser Ser Leu Ala Val Leu Ala
1 5 10 15

Asp Glu Arg Arg Phe Ser Ala 



(2) INFORMATION FOR SEQ ID NO: 52: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: 23 4 0.1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52:

Val Gly Ser Phe Lys Arg Thr Phe Lys Val Ser Cys
1 5 10



(2) INFORMATION FOR SEQ ID NO: 53: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: 29 4 0.1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53: 

Arg Gly Arg Met Phe Lys Val Ser Ser Leu Ala Val Leu Ala Asp Glu
1 5 10 15

Arg Arg Phe Ser Ala
20

(2) INFORMATION FOR SEQ ID NO: 54:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 29 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: 54 3 0.1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54:

Pro Gly Arg Trp Val Arg Gly Val Gly Ile Arg Cys Phe Lys Val Ser
1 5 10 15

Ser Leu Ala Val Leu Ala Asp Glu Arg Arg Phe Ser Ala
20 25

(2) INFORMATION FOR SEQ ID NO: 55: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 amino acids
(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: 60 2 0.1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55:

Arg Met Ser Arg Leu Phe Lys Val Ser Ser Leu Ala Val Leu Ala Asp
1 5 10 15 

Glu Arg Arg Phe Ser Ala 
20 

(2) INFORMATION FOR SEQ ID NO: 56: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: 1 4 0.1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56: 

Pro Asp Val Leu Arg Ala Val Ala Thr Arg Gln His Lys Val Ser Ser
1 5 10 15

Leu Ala Val Leu Ala Asp Glu Arg Arg Phe Ser Ala
20 25





(2) INFORMATION FOR SEQ ID NO: 57: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: 27 4 0.2

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57:

Arg Val Arg Gly His Arg Val Val Met Tyr Asn Glu
1 5 10



(2) INFORMATION FOR SEQ ID NO: 58: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: 64 2 0.1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58:

Glu Cys Leu His Arg Arg Val His Lys Ile Leu Ser
1 5 10



(2) INFORMATION FOR SEQ ID NO: 59: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: 61 2 0.1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59:

Gly Leu Lys Cys Arg Pro Met Lys Val Asn Ala Asp
1 5 10

(2) INFORMATION FOR SEQ ID NO: 60:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: 50 3 0.1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60: 

Arg His Arg Pro Phe Gly Trp Val Asn Lys Arg Ser 
1 5 10

(2) INFORMATION FOR SEQ ID NO: 61:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: 52 3 0.2

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61:

Ala Ala Arg Leu Phe Ser Gln Ile Arg Arg Phe Pro
1 5 10



(2) INFORMATION FOR SEQ ID NO: 62: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: 53 3 0.1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62:

Arg Val Arg Trp His Met Val Thr Gly Asp Lys Gly
1 5 10

(2) INFORMATION FOR SEQ ID NO: 63:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide

(vii) IMMEDIATE SOURCE: 

(B) CLONE: 31 4 0.1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63:

Arg Phe Arg Asn Cys Ser Ile Ile Ser Ala Arg Gly
1 5 10



(2) INFORMATION FOR SEQ ID NO: 64: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: 62 2 0.1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64: 

Tyr Gly Val Pro Arg Ile Val Ala His Gln Leu Met
1 5 10



(2) INFORMATION FOR SEQ ID NO: 65: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 5 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65: 

Gly Ala Asp Gly Ala 
1 5 



(2) INFORMATION FOR SEQ ID NO: 66: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66: 

Arg Gln Phe Lys Val Val Thr
1 5

(2) INFORMATION FOR SEQ ID NO: 67:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67:
Gly Lys Arg Xaa



(2) INFORMATION FOR SEQ ID NO: 68:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 53 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68: 
GCGGGCTAGC TAACTAATGG AGGATACATA AATGAAACCA GTAACGTTAT ACG 53



(2) INFORMATION FOR SEQ ID NO: 69: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 46 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69: 
CGTTCCGAGC TCACTGCCCG CTCTCGAGTC GGGAAACCTG TCGTGC 46 



(2) INFORMATION FOR SEQ ID NO: 70: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 51 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70: 
CCTCCATATG AATTGTGAGC GCTCACAATT CGGTACAGCC CCATCCCACC C 51

(2) INFORMATION FOR SEQ ID NO: 71:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 51 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71: 
CGCCATCGAT CAATTGTGAG CGCTCACAAT TCAGGATGTG TGTGATGAAG A 51

(2) INFORMATION FOR SEQ ID NO: 72: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 72 base pairs 

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72: 
TCGAGAGCGG GCAGGGGGCC GACGGGGCCT ACGGTGGTTT CCTGCGTCGT CAGTTCAAAG 60 
TTGTAACCTA AT 72 

(2) INFORMATION FOR SEQ ID NO: 73: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 72 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73: 
CTAGATTAGG TTACAACTTT GAACTGACGA CGCAGGAAAC CACCGTAGGC CCCGTCGGCC 60 
CCCTGCCCGC TC 72 
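SEQ ID NO: 72 and SEQ ID NO: 73 are largely complementary to one another. As a hedged illustration (the helper names are not from the application, and how the oligonucleotides were actually used is described elsewhere in the specification), the sketch below checks that the two strands pair everywhere except for four bases at each 5' end, and reports those single-stranded ends:

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def five_prime_overhangs(top, bottom, n=4):
    """Return the n-base 5' overhangs left when top and bottom strands anneal.

    Raises ValueError if the strands are not complementary apart from
    the n bases at each 5' end.
    """
    if reverse_complement(top)[:-n] != bottom[n:]:
        raise ValueError("strands are not complementary outside the overhangs")
    return top[:n], bottom[:n]


# Sequences copied from the listing, written in the same blocks of ten.
SEQ_72 = "".join("""TCGAGAGCGG GCAGGGGGCC GACGGGGCCT ACGGTGGTTT CCTGCGTCGT
CAGTTCAAAG TTGTAACCTA AT""".split())
SEQ_73 = "".join("""CTAGATTAGG TTACAACTTT GAACTGACGA CGCAGGAAAC CACCGTAGGC
CCCGTCGGCC CCCTGCCCGC TC""".split())

print(len(SEQ_72), len(SEQ_73))              # 72 72
print(five_prime_overhangs(SEQ_72, SEQ_73))  # ('TCGA', 'CTAG')
```

The annealed duplex thus carries a 5' TCGA end and a 5' CTAG end, which are the overhang types produced by enzymes such as XhoI and XbaI, respectively.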

(2) INFORMATION FOR SEQ ID NO: 74: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74: 
GGGCCTAATT AATTA 15

(2) INFORMATION FOR SEQ ID NO: 75:

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75: 
AGCTTAATTA ATTAGGCCCC GT 22

(2) INFORMATION FOR SEQ ID NO: 76: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH : 54 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76: 
GTGGCGCCNN KNNKNNKNNK NNKNNKNNKN NKNNKNNKNN KNNKTAAGGT CTCG 54

(2) INFORMATION FOR SEQ ID NO: 77: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 11 base pairs 
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77: 
GGCGCCACCG T 11 

(2) INFORMATION FOR SEQ ID NO: 78:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 14 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:78: 
AGCTCGAGAC CTTA 14

(2) INFORMATION FOR SEQ ID NO: 79:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79:
TATTTGCACG GCGTCACACT T 21 

(2) INFORMATION FOR SEQ ID NO: 80: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 47 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80: 
CCGCGCCTGG GCCCAGGGAA TGTAATTGAG CTCCGCCATC GCCGCTT 47 

(2) INFORMATION FOR SEQ ID NO: 81: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 62 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81: 
CGATGGCGGA GCTCAATTAC ATTCCCNNKN NKNNKNNKNN KAAACCAGTA ACGTTATACG 60
AT 62 

(2) INFORMATION FOR SEQ ID NO: 82: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 59 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82: 
CGATGGCGGA GCTCAATTAC ATTCCCNNKN NKNNKNNKAA ACCAGTAACG TTATACGAT 59

(2) INFORMATION FOR SEQ ID NO: 83:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 72 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 83:
CGCCCGCCAA GCTTAGGTTA CAACTTTGAA CTGACGMNNM NNMNNMNNGG GAATGTAATT 60 
CAGCTCCGCC AT 72 
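SEQ ID NO: 83 uses MNN triplets rather than NNK. Under the IUPAC code M (A or C) is the complement of K (G or T), so an MNN run on one strand reads as an NNK run on the complementary strand. The sketch below is an illustration only, with a hypothetical helper name, that reverse-complements the listed sequence using the IUPAC complement table:

```python
# IUPAC nucleotide complements, including the degenerate symbols.
IUPAC_COMPLEMENT = {
    "A": "T", "C": "G", "G": "C", "T": "A",
    "K": "M", "M": "K", "R": "Y", "Y": "R",
    "S": "S", "W": "W", "B": "V", "V": "B",
    "D": "H", "H": "D", "N": "N",
}


def reverse_complement_iupac(seq):
    """Reverse complement of a DNA sequence that may contain IUPAC ambiguity codes."""
    return "".join(IUPAC_COMPLEMENT[base] for base in reversed(seq))


# Sequence copied from the listing, written in the same blocks of ten.
SEQ_83 = "".join("""CGCCCGCCAA GCTTAGGTTA CAACTTTGAA CTGACGMNNM NNMNNMNNGG
GAATGTAATT CAGCTCCGCC AT""".split())

print(reverse_complement_iupac("MNNMNNMNNMNN"))  # NNKNNKNNKNNK
print(reverse_complement_iupac(SEQ_83))
```

On the complementary strand the four randomized codons are followed by CGT CAG TTC AAA GTT GTA ACC, which under the standard genetic code reads Arg Gln Phe Lys Val Val Thr, the peptide of SEQ ID NO: 66.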



(2) INFORMATION FOR SEQ ID NO: 84: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 32 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 84: 
GAATTCAATT GTGAGCGCTC ACAATTGAAT TC 32 



(2) INFORMATION FOR SEQ ID NO: 85:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85:
ACCACCTCCG G 11 



(2) INFORMATION FOR SEQ ID NO: 86:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 86: 



TTACTTAGTT A 11

(2) INFORMATION FOR SEQ ID NO: 87:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 49 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 87: 

Met Lys Pro Val Thr Leu Tyr Asp Val Ala Glu Tyr Ala Gly Val Ser 
1 5 10 15 

Tyr Gln Thr Val Ser Arg Val Val Asn Gln Ala Ser His Val Ser Ala
20 25 30

Lys Thr Arg Glu Lys Val Glu Ala Ala Met Ala Glu Leu Asn Tyr Ile
35 40 45 

Pro 

(2) INFORMATION FOR SEQ ID NO: 88:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 93 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA

(ix) FEATURE:

(A) NAME/KEY: CDS
(B) LOCATION: 1..84

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 88:

CTC GAG AGC GGG CAG GTG GTG CAT GGG GAG CAG GTG GGT GGT GAG GCC 48
Leu Glu Ser Gly Gln Val Val His Gly Glu Gln Val Gly Gly Glu Ala
1 5 10 15

TCC GGG GCC GTT AAC GGC CGT GGC CTA GCT GGC CAA TAAGTCGAC 93
Ser Gly Ala Val Asn Gly Arg Gly Leu Ala Gly Gln
20 25
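
As a cross-check on the entry above, the 28 codons printed in the nucleotide rows can be translated with the standard genetic code and compared with the amino acid rows beneath them. The Python sketch below is illustrative only and assumes the standard code; `CODON_TABLE`, `translate`, and `SEQ_88` are hypothetical names introduced for this example.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering ("*" marks a stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate from the first base and stop at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Nucleotide rows of SEQ ID NO: 88 as printed above, with spaces removed.
SEQ_88 = (
    "CTC GAG AGC GGG CAG GTG GTG CAT GGG GAG CAG GTG GGT GGT GAG GCC "
    "TCC GGG GCC GTT AAC GGC CGT GGC CTA GCT GGC CAA TAAGTCGAC"
).replace(" ", "")

PROTEIN_88 = translate(SEQ_88)
print(len(SEQ_88), PROTEIN_88)
```

The one-letter result corresponds to the three-letter translation given in the entry, and translation halts at the TAA that follows the last printed codon.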

(2) INFORMATION FOR SEQ ID NO: 89:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 28 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 89: 

Leu Glu Ser Gly Gln Val Val His Gly Glu Gln Val Gly Gly Glu Ala
1 5 10 15

Ser Gly Ala Val Asn Gly Arg Gly Leu Ala Gly Gln
20 25 

(2) INFORMATION FOR SEQ ID NO: 150:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 150: 

Lys Trp Ser Gly Leu Gly Gly Gly Arg Val Leu Val Asn 
1 5 10



(2) INFORMATION FOR SEQ ID NO: 151: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 151:

Arg Arg Trp Ala Thr Ser Gly Pro Arg Gln Leu Tyr
1 5 10



(2) INFORMATION FOR SEQ ID NO: 152: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 152:

Glu Pro Lys Phe Lys Asn Phe Arg Val Val Phe Gln Asn
1 5 10



(2) INFORMATION FOR SEQ ID NO: 153: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 153: 



Arg Trp Phe Ser Pro Gly Arg Arg Ala Phe Met Val 

(2) INFORMATION FOR SEQ ID NO: 154:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 154: 
Gly Arg Pro Phe Arg Gln Asn Ser Pro Val Val Phe



(2) INFORMATION FOR SEQ ID NO: 155: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 155: 

Trp Val Pro Arg Met Gly Arg His Leu Ser Thr Leu
1 5 10 



(2) INFORMATION FOR SEQ ID NO: 156:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 156:

Arg Thr Arg His Val Phe Lys Val Ile His Gly Phe
1 5 10 



(2) INFORMATION FOR SEQ ID NO: 157: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 157:

Asn Ala Arg Arg Met Tyr Ser Val Ala Gly Met Asp
1 5 10

(2) INFORMATION FOR SEQ ID NO: 158:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 12 amino acids
(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 158:

Trp Arg Lys Phe Ala Leu Leu Gly Ser Gly Pro Thr
1 5 10

(2) INFORMATION FOR SEQ ID NO: 159 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 159: 

His Arg Ala Tyr Arg Ile Ala Thr Met Phe Ser Gly
1 5 10

(2) INFORMATION FOR SEQ ID NO: 160: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 160:

Arg Gly Leu Met Arg Arg Ser Thr Lys Thr Val
1 5 10

(2) INFORMATION FOR SEQ ID NO: 161: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 161: 

Ala Arg His Arg Met Phe Gln Trp Ala Met Val Gly
1 5 10

(2) INFORMATION FOR SEQ ID NO: 162:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 12 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 162: 

Ile Met Ile Gly Lys Glu Gly Ala Val Ser Ser Ser
1 5 10

WHAT IS CLAIMED IS:

1. A method of isolating a DNA binding protein comprising:
(a) providing a recombinant DNA vector comprising a coding
sequence for a peptide having a specific affinity for a
receptor;
(b) inserting a library of oligonucleotides encoding
different potential DNA binding proteins into the vector
in-frame with the peptide coding sequence to form a library of
different vectors encoding different fusion proteins, the
fusion proteins differing in the potential DNA binding
protein;
(c) transforming host cells with the vectors;
(d) culturing the transformed host cells under conditions
suitable for expression of the fusion proteins, whereby, if a
fusion protein comprises a potential DNA binding protein with
affinity for the vector, the fusion protein binds to the
vector to form a complex;
(e) lysing the transformed host cells under conditions such
that complexes formed in (d) remain associated;
(f) contacting the complexes with a receptor under
conditions conducive to specific binding of the peptide to the
receptor;
(g) isolating complexes bound to the receptor, the complexes
containing vectors encoding DNA binding proteins.

2. The method of claim 1, further comprising isolating the
vectors from the complexes in (g), and repeating (c)-(g).

3. The method of claim 2, further comprising determining the
sequence of a DNA binding protein encoded by a vector in (g).

4. The method of claim 3, further comprising:
transforming the vector in (g) into host cells under
conditions suitable for expression of the fusion protein
encoded by the vector, whereby the fusion protein binds to the
vector to form a complex;
lysing the transformed host cells under conditions such that
the complex remains associated;
contacting separate samples of the complex to the receptor
and to a receptor lacking affinity for the peptide under
conditions conducive to specific binding of the peptide to the
receptor;
isolating vector from: (1) complex bound to the receptor and
(2) complex bound to the receptor lacking affinity for the
peptide;
separately transforming vector obtained from (1) and (2) and
calculating an enrichment ratio equal to transformants from
(1) divided by transformants from (2), the enrichment ratio
being a measure of the suitability of the DNA binding protein
for displaying the peptide for specific binding to the
receptor.
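
The enrichment ratio recited in claim 4 is a simple quotient of transformant counts. The Python sketch below illustrates the arithmetic under stated assumptions: the function name `enrichment_ratio`, its parameter names, and the example colony counts are hypothetical and do not come from the specification.

```python
def enrichment_ratio(specific_transformants, control_transformants):
    """Transformants from receptor-bound complexes, item (1), divided by
    transformants from the control receptor lacking affinity, item (2)."""
    if control_transformants <= 0:
        # The ratio is undefined when the control yields no transformants.
        raise ValueError("control transformant count must be positive")
    return specific_transformants / control_transformants


# Hypothetical counts, chosen only to show the calculation.
example_ratio = enrichment_ratio(5000, 50)
print(example_ratio)
```

A ratio well above 1 would indicate that vector is recovered preferentially through the peptide-receptor interaction rather than through nonspecific binding.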

5. The method of claim 2, wherein the potential DNA binding
proteins are variants of a natural DNA binding protein.

6. The method of claim 5, wherein the natural DNA binding
protein is lacI.

7. The method of claim 6, wherein the vector lacks a lacO
site.

8. The method of claim 7, wherein the potential DNA binding
proteins are variants of a headpiece dimer comprising two lac
headpieces joined by a linker.

9. The method of claim 2, further comprising contacting the
complexes with bulk DNA to compete with the vectors for
binding to the potential DNA binding proteins.

10. A method of constructing a random peptide library
comprising:
(a) providing a recombinant DNA vector that encodes a DNA
binding protein other than a phage coat protein;
(b) inserting into the coding sequence of the DNA binding
protein a coding sequence for a random peptide such that the
resulting vectors encode fusion proteins, each of which
comprises the DNA binding protein and a random peptide;
(c) transforming host cells with the vectors; and
(d) culturing the transformed host cells under conditions
suitable for expression of the fusion proteins, wherein the
fusion proteins bind via the DNA binding protein to the vector
with sufficient stability that complexes having a random
peptide with a specific affinity for a receptor can be
enriched by affinity purification on the receptor from
complexes lacking a random peptide with a specific affinity
for the receptor.

11. The method of claim 10, wherein the DNA binding protein
is a nonsequence-specific DNA binding protein.

12. A method for screening a random peptide library for a
peptide with specific affinity for a receptor, comprising:
(a) providing a peptide library wherein each member is a
host cell transformed with a recombinant DNA vector that
encodes a fusion protein comprising a DNA binding protein and
a coding sequence for a random peptide, wherein members differ
from other members with respect to the sequence of the random
peptide, wherein the fusion proteins can bind via the DNA
binding protein to the vector to form complexes having
sufficient stability that complexes having a random peptide
with a specific affinity for a receptor can be enriched by
affinity purification to the receptor from complexes lacking a
random peptide with a specific affinity for the receptor;
(b) lysing the cells transformed with the random peptide
library under conditions such that the fusion protein remains
bound to the vector that encodes the fusion protein;
(c) contacting the fusion proteins of the random peptide
library with a receptor under conditions conducive to specific
peptide-receptor binding; and
(d) isolating the vector that encodes a random peptide that
binds to said receptor.






13. The method of claim 12, wherein the DNA binding protein
has been isolated by the method of claim 1.

14. The method of claim 13, wherein the DNA binding protein
is a nonsequence-specific DNA binding protein.

15. The method of claim 13, wherein the vector lacks a lacO
site.

16. The method of claim 13, wherein the recombinant vector
further comprises a DNA sequence with a specific affinity for
the DNA binding protein.

17. The method of claim 12, wherein the host cells are
bacteria.

18. The method of claim 17, wherein the bacteria are E. coli,
and the vector is a plasmid.

19. The method of claim 18, wherein the DNA binding protein
is a lac repressor protein comprising two lac headpieces
joined by a first linker and the DNA binding protein is joined
to the random peptide by a second linker.



20. The method of claim 19, wherein the first linker is GRCR,
the two lac headpieces are designated A4.5 in Fig. 6, and the
second linker is RSQE.

21. The method of claim 19, wherein the first linker is GRCR,
the two lac headpieces are designated B4.5 in Fig. 6, and the
second linker is GPNQ.

22. The method of claim 12, wherein the random peptide is
located at the carboxy terminus of said fusion protein.

23. The method of claim 12, wherein the library has at least
10^6 different members.

24. The method of claim 12 further comprising:
(e) transforming a host cell with the vectors obtained in
(d); and repeating (b), (c), and (d) with the host cells
transformed in (e).

25. A recombinant DNA vector for constructing the random
peptide library of claim 10, said vector comprising:
(a) a DNA sequence encoding the DNA binding protein;
(b) a promoter positioned so as to drive transcription of
the DNA binding protein coding sequence;
(c) a coding sequence for a peptide inserted in the DNA
binding protein coding sequence so that the coding sequences
can be transcribed to produce an RNA transcript that can be
translated to produce a fusion protein capable of binding to
at least one DNA sequence in the vector.

26. A host cell transformed with the vector of claim 25.

27. A random peptide library comprising at least 10^6
different members, wherein each member is a host cell
transformed with a recombinant DNA vector that encodes a
fusion protein comprising a DNA binding protein other than a
phage coat protein and a random peptide; and wherein members
differ from other members with respect to the sequence of the
random peptide, wherein the fusion proteins can bind via the
DNA binding protein to the vector to form complexes having
sufficient stability that complexes having a random peptide
with a specific affinity for a receptor can be enriched by
affinity purification to the receptor from complexes lacking a
random peptide with a specific affinity for the receptor.

(2) INFORMATION FOR SEQ ID NO: 90:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 90 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(ix) FEATURE:

(A) NAME/KEY: CDS
(B) LOCATION: 1..60

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 90: 

CTC GAG AGC GGG CAG GTG GTG CAT GGG GAG CAG GTG GGT GGT GAG GCC 48
Leu Glu Ser Gly Gln Val Val His Gly Glu Gln Val Gly Gly Glu Ala
1 5 10 15

TCC GGA GGT GGT TAACTAAGTA AAGCTGGCCA ATAAGTCGAC 90 
Ser Gly Gly Gly 
20 

(2) INFORMATION FOR SEQ ID NO: 91: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 91: 

Leu Glu Ser Gly Gln Val Val His Gly Glu Gln Val Gly Gly Glu Ala
1 5 10 15

Ser Gly Gly Gly 
20 

(2) INFORMATION FOR SEQ ID NO: 92: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 48 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 92: 

Lys Pro Val Thr Leu Tyr Asp Val Ala Glu Tyr Ala Gly Val Ser Tyr
1 5 10 15

Gln Thr Val Ser Arg Val Val Asn Gln Ala Ser His Val Ser Ala Lys
20 25 30

Thr Arg Glu Lys Val Glu Ala Ala Met Ala Glu Leu Asn Tyr Ile Pro
35 40 45 

(2) INFORMATION FOR SEQ ID NO: 93:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 45 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 93: 

Arg Thr Ser Asn Val Ile Arg Cys Arg Arg Val Cys Arg Cys Leu Leu
1 5 10 15

Ser Asp Arg Phe Pro Arg Gly Glu Pro Gly Gln Pro Arg Phe Cys Glu
20 25 30 

Asn Ala Gly Lys Ser Gly Ser Gly Asp Gly Gly Ala Asp 
35 40 

(2) INFORMATION FOR SEQ ID NO: 94: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 41 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 94:

Leu Ser Asn Val Ile Arg Cys Arg Arg Val Cys Arg Cys Leu Leu Ser
1 5 10 15

Asp Arg Phe Pro Arg Gly Glu Pro Gly Gln Pro Arg Phe Cys Glu Asn
20 25 30

Ala Gly Lys Ser Gly Ser Gly Asp Gly 
35 40 

(2) INFORMATION FOR SEQ ID NO: 95: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 95: 

Val Ser Tyr Arg Thr Val Ser Arg Val Val Asn Gln Ala Gly His Val
1 5 10 15

Pro Ala Lys Thr Arg Glu Lys Val Val Ala Ala Met Ala Glu Leu Asn
20 25 30

Tyr Ile Pro
35 

(2) INFORMATION FOR SEQ ID NO: 96:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 43 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 96: 

Lys Pro Val Thr Leu Tyr Asp Val Ala Glu Tyr Ala Gly Val Ser Tyr
1 5 10 15

Arg Thr Val Ser Arg Val Val Asn Gln Ala Ser His Val Ser Ala Lys
20 25 30

Thr Arg Glu Lys Val Glu Ala Ala Thr Ala Glu 
35 40 

(2) INFORMATION FOR SEQ ID NO: 97: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 49 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 97: 

Met Lys Pro Val Thr Leu Tyr Asp Val Ala Glu Tyr Ala Gly Val Ser 
1 5 10 15

Tyr Arg Thr Val Ser Arg Val Val Asn Gln Ala Ser His Val Ser Ala
20 25 30

Lys Thr Arg Glu Lys Val Glu Ala Ala Met Ala Glu Leu Asn Tyr Ile
35 40 45 

Pro 

(2) INFORMATION FOR SEQ ID NO:98: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 48 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 98: 

Lys Pro Val Thr Leu Tyr Asp Val Ala Glu Tyr Ala Gly Val Ser Tyr
1 5 10 15

Gln Thr Val Ser Arg Val Val Asn Gln Ala Ser His Val Ser Ala Lys
20 25 30

Thr Gly Glu Glu Val Glu Ala Ala Met Ala Gly Leu Asn Tyr Ile Pro
35 40 45 

(2) INFORMATION FOR SEQ ID NO: 99:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 49 amino acids
(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 99:

Met Lys Pro Val Thr Leu Tyr Asp Val Ala Glu Tyr Ala Gly Thr Ser
1 5 10 15

Tyr Gln Thr Pro Ser Arg Val Val Asn Gln Ala Ser His Val Ser Ala
20 25 30

Lys Thr Arg Glu Lys Val Glu Ala Ala Met Ala Glu Leu Asn Tyr Ile
35 40 45 

Pro 

(2) INFORMATION FOR SEQ ID NO: 100: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 48 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 100:

Lys Pro Val Thr Leu Tyr Asp Val Ala Glu Tyr Ala Gly Val Ser Tyr 
1 5 10 15 

Arg Thr Val Ser Arg Val Val Asn Gln Ala Ser Leu Val Ser Ala Lys
20 25 30

Thr Arg Glu Lys Glu Glu Ala Ala Met Ala Glu Leu Asn Tyr Ile Pro
35 40 45



(2) INFORMATION FOR SEQ ID NO: 101: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 48 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 101: 

Lys Pro Val Thr Leu Tyr Asp Val Ala Glu Tyr Ala Gly Val Ser Tyr 
1 5 10 15 

Arg Thr Val Ser Arg Val Val Asn Gln Ala Ser His Val Ser Ala Lys
20 25 30

Thr Arg Glu Lys Val Glu Ala Ala Met Ala Glu Leu Asn Tyr Ile Pro
35 40 45

(2) INFORMATION FOR SEQ ID NO: 102:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 49 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 102: 

Met Lys Pro Val Thr Leu Tyr Asp Val Ala Glu Tyr Ala Gly Val Ser 
1 5 10 15

Tyr Gln Thr Asp Ser Arg Val Glu Asn Gln Ala Ser His Val Ser Ala
20 25 30

Lys Thr Arg Glu Lys Val Glu Ala Ala Met Ala Glu Leu Asn Tyr Ile
35 40 45 

Pro 

(2) INFORMATION FOR SEQ ID NO: 103: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 49 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 103: 

Met Lys Met Val Thr Leu Tyr Asp Val Ala Glu Tyr Ala Gly Val Ser
1 5 10 15

Tyr Gln Thr Val Ser Arg Val Val Asn Gln Ala Ser His Val Ser Ala
20 25 30

Lys Thr Arg Glu Lys Val Glu Ala Ala Met Ala Glu Leu Asn Tyr Ile
35 40 45 

Pro 

(2) INFORMATION FOR SEQ ID NO: 104: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 48 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 104: 

Lys Pro Val Thr Leu Tyr Asp Val Ala Glu Tyr Ala Gly Val Ser Tyr
1 5 10 15

Arg Thr Val Ser Arg Val Val Asn Gln Ala Ser His Ala Ser Ala Lys
20 25 30

Thr Arg Glu Lys Val Glu Ala Ala Met Thr Glu Leu Asn Tyr Ile Pro
35 40 45 



WO 96/40987 PCT/US96/09809 


(2) INFORMATION FOR SEQ ID NO: 105: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 49 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 105:

Met Lys Pro Val Thr Leu Tyr Asp Val Ala Glu Tyr Ala Gly Ala Ser
1 5 10 15

Tyr Gln Thr Val Ser Arg Val Val Asn Gln Ala Ser His Val Ser Ala
20 25 30 

Lys Thr Arg Glu Lys Val Glu Ala Ala Met Ala Glu Leu Asn Tyr Val 
35 40 45 

Pro 

(2) INFORMATION FOR SEQ ID NO: 106: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 48 amino acids
(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 106: 

Lys Pro Val Thr Leu Tyr Asn Val Ala Glu Tyr Ala Gly Val Ser Tyr 
1 5 10 15

Gln Thr Val Ser Arg Val Val Asn Gln Ala Ser His Val Ser Ala Lys
20 25 30

Thr Arg Glu Lys Val Gly Ala Ala Met Ala Glu Leu Asn Tyr Ile Pro
35 40 45

(2) INFORMATION FOR SEQ ID NO: 107: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 107: 

Xaa Xaa Xaa Xaa Xaa 
1 5 

(2) INFORMATION FOR SEQ ID NO: 108:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 108: 

Xaa Xaa Xaa Xaa 
1 



(2) INFORMATION FOR SEQ ID NO: 109:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 4 amino acids
(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 109: 

Gly Arg Cys Arg 
1 



(2) INFORMATION FOR SEQ ID NO: 110: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 110:

Gly Pro Asn Gln Arg Gln Phe Lys Val Val Thr
1 5 10



(2) INFORMATION FOR SEQ ID NO: 111: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 4 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 111: 



Val Tyr Cys Arg 
1 

(2) INFORMATION FOR SEQ ID NO: 112:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 11 amino acids
(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 112:

Asp His Pro Val Arg Gln Phe Lys Val Val Thr
1 5 10



(2) INFORMATION FOR SEQ ID NO: 113: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 113:

Thr Val Val Leu 
1 



(2) INFORMATION FOR SEQ ID NO: 114: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 114: 

Arg Arg Tyr Pro Arg Gln Phe Lys Val Val Thr
1 5 10



(2) INFORMATION FOR SEQ ID NO: 115: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 115: 

Lys Met Cys Met 
1 

(2) INFORMATION FOR SEQ ID NO: 116:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 116: 

Pro Ala Gln Ser Arg Gln Phe Lys Val Val Thr
1 5 10



(2) INFORMATION FOR SEQ ID NO: 117: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 117: 

Leu Arg Arg Cys 
1 



(2) INFORMATION FOR SEQ ID NO: 118:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 118:

Leu Ser Lys Arg Arg Gln Phe Lys Val Val Thr



(2) INFORMATION FOR SEQ ID NO: 119: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 119:

Arg Ser Gln Glu Arg Gln Phe Lys Val Val Thr

(2) INFORMATION FOR SEQ ID NO: 120:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 120:

Ser Cys Val Pro 
1 



(2) INFORMATION FOR SEQ ID NO: 121: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 121:

Lys Arg Arg Val Arg Gln Phe Lys Val Val Thr
1 5 10



(2) INFORMATION FOR SEQ ID NO: 122: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 122:

Glu His Ala Arg Arg Gln Phe Lys Val Val Thr
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 123:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 94 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..84

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 123: 

GAA GCG GCG ATG GCG GAG CTG AAT TAC ATT CCC CGG TCG CAG GAG GCC 48 
Glu Ala Ala Met Ala Glu Leu Asn Tyr Ile Pro Arg Ser Gln Glu Ala
1 5 10 15

TCC GGG GCC GTT AAC GGC CGT GGC CTA GCT GGC CAA TAAGGTCGAC 94
Ser Gly Ala Val Asn Gly Arg Gly Leu Ala Gly Gln
20 25 

(2) INFORMATION FOR SEQ ID NO: 124: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 amino acids 
(B) TYPE: amino acid
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 124: 

Glu Ala Ala Met Ala Glu Leu Asn Tyr Ile Pro Arg Ser Gln Glu Ala
1 5 10 15

Ser Gly Ala Val Asn Gly Arg Gly Leu Ala Gly Gln
20 25 

(2) INFORMATION FOR SEQ ID NO: 125: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 122 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 125:
GAAGCGGCGA TGGCGGAGCT GAATTACATT CCCCGGTCGC AGGAGGCCTC CGGAGGTGGT 60
NNKNNKNNKN NKNNKNNKNN KNNKNNKNNK NNKNNKTAAC TAAGTAAAGC TGGCCAATAA 120
GT 122 

(2) INFORMATION FOR SEQ ID NO: 126:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 126:

Glu Ala Ala Met Ala Glu Leu Asn Tyr Ile Pro Arg Ser Gln Glu Ala
1 5 10 15

Ser Gly Gly Gly Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 
20 25 30 

(2) INFORMATION FOR SEQ ID NO: 127: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 127: 

Lys Gln Phe Lys Val Thr Lys Thr
1 5 

(2) INFORMATION FOR SEQ ID NO: 128:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 12 amino acids
(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 128:

Phe His Val Thr Gly Lys Ala Trp Cys Pro Leu Arg
1 5 10

(2) INFORMATION FOR SEQ ID NO: 129: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 129:

Thr Phe Lys Val Val Pro Gln Met Glu Gly Met Thr
1 5 10

(2) INFORMATION FOR SEQ ID NO: 130:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 130: 

Glu Val Gln Ile Arg Ser Phe Arg Val Gly Lys Val
1 5 10



(2) INFORMATION FOR SEQ ID NO: 131: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 131: 

Tyr Leu Ser Thr Glu Arg Pro Arg Arg Met Phe His Leu Thr Lys 
1 5 10 15



(2) INFORMATION FOR SEQ ID NO: 132: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 132: 

Val Arg Met His Lys Val Ser Glu Gln Ser Arg Phe
1 5 10



(2) INFORMATION FOR SEQ ID NO: 133: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 133: 
His Ser Arg Ala Phe Arg Ala Thr Lys Ser Val Val
1 5 10

(2) INFORMATION FOR SEQ ID NO: 134:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 134: 

Arg His His Met Phe Ser Val Thr Arg Ile Trp Asp
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 135: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 135: 

Ala Phe Ala Val Thr His Lys Arg Asn Arg Gly Tyr
1 5 10

(2) INFORMATION FOR SEQ ID NO: 136:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 136:

Arg Ser Leu Ala Gly Arg Arg Phe Arg Ile Leu Gly Asn
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 137: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 137: 

Ile Glu His Pro Tyr Arg Ile Asp Arg Met Val Met
1 5 10

(2) INFORMATION FOR SEQ ID NO: 138:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 138: 

His Arg Ser Leu Pro Ser Thr Arg Arg Phe Arg Leu Thr Lys 
1 5 10 



(2) INFORMATION FOR SEQ ID NO: 139: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 139: 

Phe Ser Val Val Arg Gly Cys Arg Ile Phe Arg Ile Asn
1 5 10



(2) INFORMATION FOR SEQ ID NO: 140: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 140: 

Gln Phe Arg Val Val Thr Leu Thr Ser Pro Leu Ala
1 5 10



(2) INFORMATION FOR SEQ ID NO: 141: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 141:

Leu Ala Arg Pro Phe Arg Arg Ala Lys Leu Asp Gly 
1 5 10

(2) INFORMATION FOR SEQ ID NO: 142:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 142:

Leu Leu Arg Arg Pro Phe Met Val Asn Arg Asn Thr
1 5 10



(2) INFORMATION FOR SEQ ID NO: 143:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 143:

His Arg Tyr Asn Arg Thr Val Gly Ile Asn Glu Val
1 5 10



(2) INFORMATION FOR SEQ ID NO: 144: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 144: 

Arg Arg Arg Arg Asn Cys Gln Ile Val Gly Tyr Trp
1 5 10



(2) INFORMATION FOR SEQ ID NO: 145: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 145: 
Arg Gly Leu Met Arg Arg Ser Tyr Lys Thr Val
1 5 10

(2) INFORMATION FOR SEQ ID NO: 146:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 13 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 146: 

Met Gly Gly Arg Arg Val Arg Leu Ala Arg Ile Ile Asn
1 5 10

(2) INFORMATION FOR SEQ ID NO: 147: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 147:

Ser Gly Arg Pro Phe Arg Met Glu Arg Gln Arg Pro
1 5 10

(2) INFORMATION FOR SEQ ID NO: 148:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 12 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(ix) FEATURE:

(A) NAME/KEY: Region

(B) LOCATION: 5

(D) OTHER INFORMATION: /note= "Xaa is unknown."

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 148:

Lys Met Val Arg Xaa Ile Phe Arg Thr Ile Pro Gly
1 5 10

(2) INFORMATION FOR SEQ ID NO: 149: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 149: 

Leu Arg Arg Met Arg Val Val Ile Arg
1 5